digitalmars.D - enforce()?

bearophile (5/5) Jun 15 2010 I have counted about 200 usages of std.contracts.enforce() inside Phobos...

Andrei Alexandrescu (5/10) Jun 15 2010 AssumeSorted was an experiment. I think it has drawbacks that I don't

Bernard Helyer (7/8) Jun 16 2010 Please don't start replying to queries in this fashion. Not everyone

Andrei Alexandrescu (9/19) Jun 16 2010 All right, all right.

Lars T. Kyllingstad (7/25) Jun 16 2010 I think any confusion regarding this may stem from the fact that enforce
Walter Bright (3/10) Jun 16 2010 Yes, I agree it is extremely important to separate the concepts of contr...

Steven Schveighoffer (5/14) Jun 16 2010 Hm... what are the drawbacks (besides it not being enforced)? I thought...

Andrei Alexandrescu (45/61) Jul 17 2010 Sorry I took so long (over one month!) to reply to this. I've delayed

Steven Schveighoffer (10/72) Jul 19 2010 Just thinking out loud here, couldn't you use the predicate already in

Andrei Alexandrescu (5/12) Jul 19 2010 That's a good idea. The find predicate that could be derived from

Steven Schveighoffer (8/20) Jul 19 2010 You're welcome :)

Andrei Alexandrescu (4/25) Jul 19 2010 You mean like this? :o)

Steven Schveighoffer (24/52) Jul 19 2010 Yep. I realized after I wrote this that you probably were already doing...

Andrei Alexandrescu (28/69) Jul 19 2010 Yah, it's quite the STL classic. STL commonly defines implicitly

Steven Schveighoffer (16/21) Jul 19 2010 Sorry for the delay, I've been very busy at work, and I wanted to slip i...

Andrei Alexandrescu (15/36) Jul 19 2010 Walter told me that union is instrumental to keeping the compiler in the...

Steven Schveighoffer (29/69) Jul 19 2010 I don't pretend to know what ominous problems Walter knows about regardi...

Andrei Alexandrescu (5/46) Jul 19 2010 I don't think so (applied to all of the above) for reasons of various
bearophile (7/13) Jul 19 2010 How much more hidden shit like this do I have to see?

Steven Schveighoffer (18/35) Jul 19 2010 What's so horrible about it? It's a corner case. If you were allocatin...

bearophile (8/19) Jul 19 2010 4-word structs are quite common. It's not a common corner.

Steven Schveighoffer (32/54) Jul 19 2010 You mean it *is* a common corner? I agree, and I think the bug report i...

bearophile (5/9) Jul 19 2010 I see, I didn't know this. Sorry for losing my temper Steven...

Steven Schveighoffer (16/25) Jul 19 2010 Hm... unfortunately, I think you will end up in the same boat. Because ...

Andrei Alexandrescu (8/12) Jul 19 2010 I think this characterization is a bit inaccurate because it suggests

Steven Schveighoffer (5/16) Jul 19 2010 There is a cost though... which was my point. Isn't everyone always

bearophile (6/8) Jul 19 2010 RAM is cheap, but the CPU doesn't use RAM, it mostly uses L1 cache (and...

BLS (25/27) Jul 19 2010 A few months ago (12-15) I also made that suggestion to Steve.

Lutger (3/12) Jun 15 2010 I'd think of it this way: enforce() is part of defensive programming, an...

Jonathan M Davis (15/23) Jun 16 2010 That's probably a pretty good way of putting it. It's essentially the

Ali Çehreli (25/29) Jun 16 2010 Makes sense.

Simen kjaeraas (12/30) Jun 16 2010 Seeing as how Error is supposed to be unrecoverable, and Exception might...
Jonathan M Davis (17/57) Jun 16 2010 Well, in a sense, the fact that assertions throw is an implementation de...

Ali Çehreli (14/15) Jun 16 2010 I can see two benefits:

Walter Bright (4/10) Jun 16 2010 The difference is not based on those 3 points, but on what Andrei wrote ...

Ary Borenszweig (5/15) Jun 16 2010 Could you please explain them? There are many people here that don't

Steven Schveighoffer (19/35) Jun 16 2010 I think of enforce as a convenient way translating an error in an

Lars T. Kyllingstad (7/44) Jun 16 2010 It also adds a file and a line number to the error message, so the
Alex Makhotin (7/10) Jun 16 2010 So why not concatenate the two and rename it to exactly 'throwif'?

Andrei Alexandrescu (4/11) Jun 16 2010 Well throwif describes mechanism and enforce describes intent. After all...
Bruno Medeiros (6/9) Jun 17 2010 Indeed, especially given that other code program may use throwing as

Leandro Lucarella (21/59) Jun 16 2010 So maybe throw_if() would be a better name =)

Andrei Alexandrescu (11/63) Jun 16 2010 I think there is no real need for exception hierarchies. I occasionally

dsimcha (12/17) Jun 16 2010 IMHO the presence of a simple method of handling errors, even if it's fa...
Michel Fortin (48/114) Jun 16 2010 The need is not really for a hierarchy. The hierarchy serves the need,
Leandro Lucarella (17/42) Jun 16 2010 Exception hierarchy is only one way to discriminate error types. Extra
Jonathan M Davis (19/53) Jun 16 2010 I think that exception hierarchies can be quite useful, but in most case...

Jason Spencer (10/10) Jun 16 2010 I think about it roughly this way (in reverse priority):

Walter Bright (14/23) Jun 16 2010 It has nothing to do with being dumb, as it is not obvious.

Lutger (18/46) Jun 16 2010 I am not so sure about this last point, usually you want to fail but per...

Simen kjaeraas (11/33) Jun 16 2010 How did you end up with an email system that is so horribly broken that

Lutger (24/62) Jun 16 2010 Not Errors, it is not in D and does not distinguish between Errors and

Simen kjaeraas (20/39) Jun 17 2010 ay

Walter Bright (5/8) Jun 17 2010 The contract failing means you do not know what went wrong. That means t...

Lutger (4/7) Jun 16 2010 This is the question: should I segfault on a handwritten letter even if ...

Simen kjaeraas (12/20) Jun 17 2010 Yes. If someone is passing your email system a handwritten letter,

Jérôme M. Berger (17/40) Jun 17 2010 Bad example. If someone is passing bad input to your program, it

Walter Bright (15/25) Jun 16 2010 First you need to decide if it is a program bug or not. If it is not a p...

Lutger (5/25) Jun 16 2010 I didn't really get this point from your articles on the subject, but th...

Walter Bright (2/5) Jun 16 2010 Exactly.

Bruno Medeiros (15/31) Jun 17 2010 I would go further and state that anything outside the direct control of...

Walter Bright (2/15) Jun 17 2010 That's a reasonable way of looking at it.
Sean Kelly (2/37) Jun 18 2010 Right. I'd say contracts are to catch logic errors.

Ary Borenszweig (2/26) Jun 16 2010 Ah, ok, now I understand. Thanks.

Michel Fortin (8/11) Jun 16 2010 True.

Andrei Alexandrescu (5/16) Jun 16 2010 You're right! I think Lars' suggestion is sensible - we should move

Lars T. Kyllingstad (11/29) Jun 17 2010 A few suggestions (even though I still think it belongs in object.d), in...

Andrei Alexandrescu (7/36) Jun 27 2010 We haven't reached consensus on where to put enforce() and friends. Any

Simen kjaeraas (4/7) Jun 27 2010 Sounds good.
Jonathan M Davis (6/16) Jun 27 2010 std.exception sounds like a good plan. I'm not overly fond of any of the...
Sean Kelly (2/8) Jun 28 2010 The trace functionality already exists in druntime. As for exceptions, ...

Rory McGuire (4/15) Jun 28 2010 How does one get a print out of the stack trace then? Is it a setting or...

Sean Kelly (2/20) Jun 28 2010 I should qualify my original statement by saying that it's only implemen...

Rory McGuire (22/47) Jun 28 2010 Is there a way to get the function name/line? I'm using this on ubuntu
Andrei Alexandrescu (4/23) Jun 28 2010 My stack traces look indecipherable on Ubuntu. They only contain module

Lars T. Kyllingstad (4/46) Jun 28 2010 TDPL mentions several times that enforce() is in std.contracts. Doesn't...

Andrei Alexandrescu (4/48) Jun 28 2010 I plan to move it to std.exception in a backward-compatible way (have

torhu (4/9) Jul 05 2010 How will std.exception relate to core.exception? Seems to me having two...

Walter Bright (2/13) Jun 16 2010 I agree completely. enforce must move.

Andrei Alexandrescu (3/17) Jun 16 2010 Where to?

Jonathan M Davis (5/24) Jun 16 2010 I would point out that pretty much nothing in std.contracts actually rel...

Michel Fortin (8/11) Jun 16 2010 I concur: the module is misnamed. The only things not related to error

Michel Fortin (7/16) Jun 16 2010 Oh, forgot about "pointsTo" too. What's the link with contracts, or

Andrei Alexandrescu (4/22) Jun 16 2010 Certain functions (notably swap) must make sure that there's no mutual

Michel Fortin (12/30) Jun 16 2010 Ok, so you're using "pointsTo" to check this in a contract? But isn't

Michel Fortin (6/12) Jun 16 2010 Should have concluded by: "I'm not sure where you *should* put it either...

Walter Bright (2/19) Jun 16 2010 Dunno.

Don (3/23) Jun 16 2010 import std.dunno;

Walter Bright (2/4) Jun 16 2010 cut & print.
Ali Çehreli (3/5) Jun 16 2010 Or std.poisson... :p

biozic (3/7) Jun 16 2010 Better name it std.fishy, because std.poisson could be mistaken for a

bearophile (18/18) Jun 19 2010 Sorry for not answering before, I was quite busy (despite in the meantim...

Andrei Alexandrescu (5/9) Jun 19 2010 Walter and I discussed this and concluded that Phobos should handle its

Lutger (3/14) Jun 19 2010 That is sensible. Are private functions (those only called from within P...

Andrei Alexandrescu (4/18) Jun 19 2010 Yes, precisely. (The actual code does not fully obey this intention;

Walter Bright (5/14) Jun 19 2010 I should add that any library that may be used as a dll should have its

Vladimir Panteleev (22/35) Jun 19 2010 I don't see the logic in this...

Walter Bright (7/47) Jun 20 2010 An input to a dll is user input, and should be validated (for the sake o...

Vladimir Panteleev (12/24) Jun 20 2010 I don't understand why you're saying this. Security checks in DLL

BCS (21/49) Jun 20 2010 import my.dll;

Vladimir Panteleev (14/32) Jun 20 2010 A well-designed application needs to validate unsafe user input exactly ...

BCS (13/51) Jun 20 2010 If I didn't write the DLL I'm calling, I'll assume it doesn't check stuf...

Vladimir Panteleev (21/26) Jun 21 2010 If you can't trust the DLL to perform correct user data validation, you ...

Andrei Alexandrescu (11/17) Jun 20 2010 [snip]
Walter Bright (5/13) Jun 20 2010 It's true that whenever user code is executed, that code can do anything...

Vladimir Panteleev (12/25) Jun 20 2010 Yes, but this is a completely different kind of trust (incompetence
Leandro Lucarella (16/32) Jun 20 2010 How can you prevent that? If you pass incorrect data to a DLL, then the

Walter Bright (5/7) Jun 20 2010 Windows has had major legacy compatibility issues because critical third...

Leandro Lucarella (15/24) Jun 20 2010 Luckily I haven't used Windows for about 10 years now =)

Andrei Alexandrescu (3/19) Jun 21 2010 Why is it stupid?

Leandro Lucarella (10/32) Jun 21 2010 Because you're adding unnecessary extra checks, just based on

Vladimir Panteleev (9/34) Jun 21 2010 Walter makes a good point. If someone uses your API in the wrong way and...

Sean Kelly (7/27) Jun 21 2010 If an unrecoverable failure occurs within the process, does it matter whe...

bearophile (11/13) Jun 21 2010 Using Design by Contract is not easy, you need to train yourself to use ...
Walter Bright (9/11) Jun 22 2010 I have, and here's how it's done:

Sean Kelly (3/19) Jun 22 2010 A coworker of mine knows a guy who had workers on that rig and told me t...

Sean Kelly (2/4) Jun 22 2010 I should add that I'm hoping the message passing model in D will help en...

Lutger (3/15) Jun 23 2010 Can we look forward to seeing ipc supported in phobos via the same inter...

Sean Kelly (4/29) Jun 24 2010 Yes. The core send/receive API should work just fine for IPC, and it's

Jacob Carlborg (20/49) Jun 25 2010 I have a serialization library, http://dsource.org/projects/orange/ ,

Jacob Carlborg (4/53) Jun 25 2010 ... should be fixed.
Robert Jacques (3/52) Jun 25 2010 I'll volunteer to help test (and to add JSON capabilities) when you're

Jacob Carlborg (7/64) Jun 25 2010 It's ready to be tested with D1 and Tango. You can also start building a...

BCS (4/7) Jun 22 2010 link by chance?

Andrei Alexandrescu (38/116) Jun 27 2010 Any complex API will face at some point some tension between reusing and...

Rory McGuire (6/22) Jun 21 2010 I think perhaps you misunderstood, it is mostly not stupidity that caus...

Sean Kelly (2/7) Jun 21 2010 Or sometimes simply desperation. There are some classes of apps that re...

Don (2/9) Jun 21 2010 Remember Windows 3.0? File handling involved undocumented API calls!

Adrian Matoga (19/38) Jun 21 2010 It was 15 years ago, at the times of 3.x and 95, when Windows behaved

Vladimir Panteleev (6/24) Jun 21 2010 More like 10, Windows Millennium was the last 9x-based Windows operating...

BCS (12/18) Jun 20 2010 A DLL can work just fine (a.k.a. not explode) and still return garbage a...

Vladimir Panteleev (7/12) Jun 21 2010 I think that for such situations you should ship a debug and release

BCS (6/20) Jun 21 2010 Until you can show me a perf problem, I don't see any point in doing tha...

Simen kjaeraas (7/15) Jun 22 2010 Also, if you do have two different versions, I'll bet you ready money

Sean Kelly (7/23) Jun 22 2010 What I've done with druntime is build checked and unchecked versions. I
Lutger (9/25) Jun 22 2010 Naturally, debug is for debugging, not shipping. Instead one could make ...

Norbert Nemec (23/25) Jun 28 2010 In that case, feel free to compile DLLs with external contract checking

Simen kjaeraas (8/11) Jun 28 2010 And if the application designer finds that his design breaks due to
bearophile (4/7) Jun 28 2010 Is this a positive thing to do? Can this be done? (D must support separa...

Norbert Nemec (49/54) Jun 30 2010 These are good and pragmatic questions that you ask.

Sean Kelly (4/25) Jun 30 2010 I see the choice of "release" for disabling contracts as a huge mistake ...

Norbert Nemec (19/43) Jun 30 2010 That's indeed an interesting aspect: Design by Contract (DbC) and

Jay Byrd (19/86) Sep 10 2010 This is all very confused, and is reflected in D implementing contracts

bearophile (5/6) Sep 11 2010 If you know well the ideas of DbC, and you think there are some problems...

bearophile (1/2) Sep 11 2010 Ignore that 'not', please.
retard (3/18) Sep 11 2010 People are doing it wrong. They shouldn't come and rant here. They shoul...

Norbert Nemec (28/31) Sep 11 2010 In fact, it is yet one step more complex than that: as the name itself

bearophile (5/8) Sep 12 2010 There is also a mixed strategy: to use run-time checks in the callee for...

Norbert Nemec (15/21) Sep 12 2010 Indeed - this would mean a bare, unchecked interface for each function

Danny Wilson (4/17) Jun 28 2010 I like this idea.

Norbert Nemec (29/37) Jun 28 2010 IMHO, this is plain wrong!

Andrei Alexandrescu (3/17) Jun 28 2010 C APIs also check their arguments.

bearophile (4/5) Jun 28 2010 Try again, C doesn't have DbC :-) Norbert Nemec says some good things.

Andrei Alexandrescu (8/13) Jun 28 2010 What I meant to say was that even the standard library of a language

Norbert Nemec (25/35) Jun 30 2010 Indeed, checking input arguments is essential. DbC simply means

Michel Fortin (20/39) Jun 28 2010 With C you don't have the option to turn the checks on or off. It's

Andrei Alexandrescu (3/27) Jun 28 2010 #define NDEBUG

Sean Kelly (2/4) Jun 28 2010 Not the standard C library, as far as I know. Of course, it's also gott...

Andrei Alexandrescu (6/10) Jun 28 2010 Nonono. They check whenever they can. Oftentimes they're unable to check...

bearophile <bearophileHUGS lycos.com> writes:

I have counted about 200 usages of std.contracts.enforce() inside Phobos. Can
you tell me what's the purpose of enforce() in a language that has built-in
Contract Programming?

And what are the purposes of std.contracts.AssumeSorted()? Is it useful for
something?

Bye,
bearophile
(I know this is not the digitalmars.D.learn newsgroup).

Jun 15 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

You need to read TDPL for that :o).

 And what are the purposes of std.contracts.AssumeSorted()? Is it
 useful for something?

AssumeSorted was an experiment. I think it has drawbacks that I don't 
know how to address, so I'll retire it.


Andrei

Jun 15 2010

Bernard Helyer <b.helyer gmail.com> writes:

You need to read TDPL for that :o).

Please don't start replying to queries in this fashion. Not everyone
has the wherewithal to get a copy of a book such as TDPL. Especially
seeing as you're the author, this kind of reply just looks like whoring
for the book. I'm not saying that's what it is, just what it can look
like.

I've got TDPL on the way from Amazon, by the way. I just don't want to
see this reply, and wanted to express my distaste.

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bernard Helyer wrote:
 You need to read TDPL for that :o).

 
 Please don't start replying to queries in this fashion. Not everyone
 has the wherewithal to get a copy of a book such as TDPL. Especially
 seeing as you're the author, this kind of reply just looks like whoring
 for the book. I'm not saying that's what it is, just what it can look
 like.
 
 I've got TDPL on the way from Amazon, by the way. I just don't want to
 see this reply, and wanted to express my distaste.

All right, all right.

Basically there's a marked difference between contract checking (which 
verifies the architectural integrity of a program) and error handling 
(which deals with errors that occur in correct programs). Contracts help 
with the former, enforce helps with the latter.

The differences are marked enough that TDPL dedicates a separate chapter 
to each.
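
The distinction can be sketched in a few lines. This is only an illustration, not code from TDPL or Phobos: largest and parsePort are hypothetical functions, enforce is imported from std.exception (where current Phobos keeps it; at the time of this thread it lived in std.contracts), and older compilers spell the `do` keyword of a contract as `body`.

```d
import std.conv : to;
import std.exception : enforce;

// Contract: an empty array here means the *caller* has a bug, so the check
// is an `in` contract built on assert (contracts are removed by -release).
int largest(const int[] values)
in
{
    assert(values.length > 0, "largest() requires a non-empty array");
}
do
{
    int best = values[0];
    foreach (v; values[1 .. $])
        if (v > best) best = v;
    return best;
}

// Error handling: bad text can reach a perfectly correct program, so the
// check uses enforce, which throws an Exception and stays in every build.
ushort parsePort(string text)
{
    immutable n = to!int(text); // throws ConvException on non-numeric text
    enforce(n > 0 && n <= 65_535, "port out of range: " ~ text);
    return cast(ushort) n;
}

void main()
{
    assert(largest([3, 9, 4]) == 9);
    assert(parsePort("8080") == 8080);

    try
    {
        parsePort("70000");
        assert(false, "expected an exception");
    }
    catch (Exception e)
    {
        // The caller can recover: report the problem, ask again, etc.
        assert(e.msg == "port out of range: 70000");
    }
}
```

A failed contract signals a bug to be fixed in the source; a failed enforce signals a condition the running program is expected to cope with.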


Andrei

Jun 16 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 16 Jun 2010 00:18:03 -0700, Andrei Alexandrescu wrote:

 Bernard Helyer wrote:
 You need to read TDPL for that :o).

 
 Please don't start replying to queries in this fashion. Not everyone
 has the wherewithal to get a copy of a book such as TDPL. Especially
 seeing as you're the author, this kind of reply just looks like whoring
 for the book. I'm not saying that's what it is, just what it can look
 like.
 
 I've got TDPL on the way from Amazon, by the way. I just don't want to
 see this reply, and wanted to express my distaste.

 
 All right, all right.
 
 Basically there's a marked difference between contract checking (which
 verifies the architectural integrity of a program) and error handling
 (which deals with errors that occur in correct programs). Contracts help
 with the former, enforce helps with the latter.

I think any confusion regarding this may stem from the fact that enforce()
resides in std.contracts.

Personally, I think it's worth moving it to object.d, but maybe it's too 
late for that?  Anyway, I love enforce() -- it's become my standard error 
handling tool.

-Lars

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Basically there's a marked difference between contract checking (which 
 verifies the architectural integrity of a program) and error handling 
 (which deals with errors that occur in correct programs). Contracts help 
 with the former, enforce helps with the latter.
 
 The differences are marked enough that TDPL dedicates a separate chapter 
 to each.

Yes, I agree it is extremely important to separate the concepts of contract 
checking from error handling.

Jun 16 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 15 Jun 2010 22:23:15 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 You need to read TDPL for that :o).

 And what are the purposes of std.contracts.AssumeSorted()? Is it
 useful for something?

 AssumeSorted was an experiment. I think it has drawbacks that I don't  
 know how to address, so I'll retire it.

Hm... what are the drawbacks (besides it not being enforced)?  I thought  
it was a good solution.

-Steve

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/16/2010 05:47 AM, Steven Schveighoffer wrote:
 On Tue, 15 Jun 2010 22:23:15 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 You need to read TDPL for that :o).

 And what are the purposes of std.contracts.AssumeSorted()? Is it
 useful for something?

 AssumeSorted was an experiment. I think it has drawbacks that I don't
 know how to address, so I'll retire it.

 Hm... what are the drawbacks (besides it not being enforced)? I thought
 it was a good solution.

Sorry I took so long (over one month!) to reply to this. I've delayed 
the reply to the point when it could be integrated within the upcoming 
thread about improving std.algorithm.find.

The problem with AssumeSorted is that generally predicates in D (and 
also in most other languages) are not easy to compare. Let's first 
recall AssumeSorted's definition. It's just a wrapper:

/**
Passes the type system the information that $(D range) is already
sorted by predicate $(D pred). No checking is performed; debug builds
may insert checks randomly. To insert a check, see $(XREF algorithm,
isSorted).
  */
struct AssumeSorted(Range, alias pred = "a < b")
{
     /// Alias for $(D Range).
     alias Range AssumeSorted;
     /// The passed-in range.
     Range assumeSorted;
     /// The sorting predicate.
     alias pred assumeSortedBy;
}

/// Ditto
AssumeSorted!(Range, pred) assumeSorted(alias pred = "a < b", Range)
(Range r)
{
     AssumeSorted!(Range, pred) result;
     result.assumeSorted = r;
     return result;
}

The recommended way to use the facility is:

int[] a = [ -1, 0, 1, 2, 3, 4, 5 ];
assert(find(assumeSorted(a), 3) == [ 3, 4, 5 ]);

find() uses simple means to detect that its first argument has type 
AssumeSorted and takes advantage of that when searching (specifically by 
doing binary search).

So far so good. The problem ensues when we want to make sure that the 
sorting predicate is in sync with the search predicate. For example, if 
the search predicate is "==" then it's okay to use "<" or ">" as a 
sorting predicate. But searching for "a.zip == b.zip" in a range sorted 
by "a.name < b.name" is not okay.

If predicates were all expressed as strings, probably some string 
manipulation could be done to see whether they are compatible. But as 
things stand, AssumeSorted has quite limited power.
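
The mismatch between sorting predicate and search predicate is easy to reproduce. A minimal sketch, assuming a hypothetical Person struct and a hand-written binarySearchByZip (neither is part of Phobos): the data is ordered by name, so a binary search that compares zips walks in the wrong direction and misses an element that a linear search finds.

```d
import std.algorithm : canFind;

// Hypothetical record type used only for this illustration.
struct Person { string name; int zip; }

// Plain binary search that silently assumes `people` is ordered by zip.
bool binarySearchByZip(const Person[] people, int zip)
{
    size_t lo = 0, hi = people.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (people[mid].zip < zip) lo = mid + 1;
        else hi = mid;
    }
    return lo < people.length && people[lo].zip == zip;
}

void main()
{
    // Sorted by "a.name < b.name"; the zips are in no particular order.
    auto people = [Person("Alice", 90210),
                   Person("Bob", 10001),
                   Person("Carol", 60601)];

    // Linear search with the zip predicate finds Alice.
    assert(people.canFind!(p => p.zip == 90210));

    // Binary search with the same predicate misses her, because the
    // ordering it relies on is not the ordering the data actually has.
    assert(!binarySearchByZip(people, 90210));
}
```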


Andrei

Jul 17 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 17 Jul 2010 16:25:16 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 06/16/2010 05:47 AM, Steven Schveighoffer wrote:
 On Tue, 15 Jun 2010 22:23:15 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 You need to read TDPL for that :o).

 And what are the purposes of std.contracts.AssumeSorted()? Is it
 useful for something?

 AssumeSorted was an experiment. I think it has drawbacks that I don't
 know how to address, so I'll retire it.

 Hm... what are the drawbacks (besides it not being enforced)? I thought
 it was a good solution.

 Sorry I took so long (over one month!) to reply to this. I've delayed  
 the reply to the point when it could be integrated within the upcoming  
 thread about improving std.algorithm.find.

 The problem with AssumeSorted is that generally predicates in D (and  
 also in most other languages) are not easy to compare. Let's first  
 recall AssumeSorted's definition. It's just a wrapper:

 /**
 Passes the type system the information that $(D range) is already
 sorted by predicate $(D pred). No checking is performed; debug builds
 may insert checks randomly. To insert a check, see $(XREF algorithm,
 isSorted).
   */
 struct AssumeSorted(Range, alias pred = "a < b")
 {
      /// Alias for $(D Range).
      alias Range AssumeSorted;
      /// The passed-in range.
      Range assumeSorted;
      /// The sorting predicate.
      alias pred assumeSortedBy;
 }

 /// Ditto
 AssumeSorted!(Range, pred) assumeSorted(alias pred = "a < b", Range)
 (Range r)
 {
      AssumeSorted!(Range, pred) result;
      result.assumeSorted = r;
      return result;
 }

 The recommended way to use the facility is:

 int[] a = [ -1, 0, 1, 2, 3, 4, 5 ];
 assert(find(assumeSorted(a), 3) == [ 3, 4, 5 ]);

 find() uses simple means to detect that its first argument has type  
 AssumeSorted and takes advantage of that when searching (specifically by  
 doing binary search).

 So far so good. The problem ensues when we want to make sure that the  
 sorting predicate is in sync with the search predicate. For example, if  
 the search predicate is "==" then it's okay to use "<" or ">" as a  
 sorting predicate. But searching for "a.zip == b.zip" in a range sorted  
 by "a.name < b.name" is not okay.

 If predicates were all expressed as strings, probably some string  
 manipulation could be done to see whether they are compatible. But as  
 things stand, AssumeSorted has quite limited power.

Just thinking out loud here, couldn't you use the predicate already in  
AssumeSorted?  I mean, if you're going to pass AssumeSorted into find, you  
don't want to also specify the predicate as then the range just becomes a  
standard range.

There must be some kind of way to use template constraints to kill the  
predicate arg to find when the range is an AssumeSorted struct.  If not,  
there should be.

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 06:36 AM, Steven Schveighoffer wrote:
 Just thinking out loud here, couldn't you use the predicate already in
 AssumeSorted? I mean, if you're going to pass AssumeSorted into find,
 you don't want to also specify the predicate as then the range just
 becomes a standard range.

 There must be some kind of way to use template constraints to kill the
 predicate arg to find when the range is an AssumeSorted struct. If not,
 there should be.

That's a good idea. The find predicate that could be derived from 
AssumeSorted's predicate pred would be !pred(a, b) && !pred(b, a).
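
As a rough sketch of that derivation (equivalent is a hypothetical helper, not a Phobos function, and Person is made up for the example), the equality test falls out of the ordering predicate alone:

```d
import std.functional : binaryFun;

// Hypothetical helper: the equivalence induced by a strict ordering
// predicate. Two values are equivalent when neither orders before the other.
bool equivalent(alias pred = "a < b", T)(T a, T b)
{
    alias less = binaryFun!pred;
    return !less(a, b) && !less(b, a);
}

// Hypothetical record type used only for this illustration.
struct Person { string name; int zip; }

void main()
{
    assert(equivalent(3, 3));
    assert(!equivalent(3, 4));

    // Equivalent under the name ordering even though the zips differ:
    // the derived predicate only sees what the sorting predicate sees.
    assert(equivalent!"a.name < b.name"(Person("Bob", 10001),
                                        Person("Bob", 60601)));
}
```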

Thanks, Steve.


Andrei

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 09:36:54 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 06:36 AM, Steven Schveighoffer wrote:
 Just thinking out loud here, couldn't you use the predicate already in
 AssumeSorted? I mean, if you're going to pass AssumeSorted into find,
 you don't want to also specify the predicate as then the range just
 becomes a standard range.

 There must be some kind of way to use template constraints to kill the
 predicate arg to find when the range is an AssumeSorted struct. If not,
 there should be.

 That's a good idea. The find predicate that could be derived from  
 AssumeSorted's predicate pred would be !pred(a, b) && !pred(b, a).

 Thanks, Steve.

You're welcome :)

BTW, you don't need the combo predicate until the very end.  Basically,  
you do a binary search for the first element where pred(a, E) is false  
(where E is the target), and then see if pred(E, a) is also false on that  
element (to test for equality).
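
The procedure described above can be sketched as follows. This is an illustration rather than the Phobos implementation; findSorted is a hypothetical name, and it works on plain arrays instead of an AssumeSorted wrapper.

```d
import std.functional : binaryFun;

// Returns the tail of `sorted` starting at the first element equivalent to
// `target`, or an empty slice if there is none. `sorted` must be ordered
// by `pred`.
T[] findSorted(alias pred = "a < b", T)(T[] sorted, T target)
{
    alias less = binaryFun!pred;

    // Binary search for the first element where less(element, target)
    // is false. Only one direction of the predicate is used here.
    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (less(sorted[mid], target)) lo = mid + 1;
        else hi = mid;
    }

    // Single test in the opposite direction to confirm equivalence.
    if (lo < sorted.length && !less(target, sorted[lo]))
        return sorted[lo .. $];
    return sorted[$ .. $];
}

void main()
{
    int[] a = [-1, 0, 1, 2, 3, 4, 5];
    assert(findSorted(a, 3) == [3, 4, 5]);
    assert(findSorted(a, 7).length == 0);

    int[] gaps = [1, 3, 5];
    assert(findSorted(gaps, 4).length == 0);

    // The same code works for any ordering predicate.
    int[] desc = [5, 4, 3, 1];
    assert(findSorted!"a > b"(desc, 3) == [3, 1]);
}
```

The opposite comparison runs exactly once, after the loop, no matter how long the range is.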

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 09:27 AM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 09:36:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 06:36 AM, Steven Schveighoffer wrote:
 Just thinking out loud here, couldn't you use the predicate already in
 AssumeSorted? I mean, if you're going to pass AssumeSorted into find,
 you don't want to also specify the predicate as then the range just
 becomes a standard range.

 There must be some kind of way to use template constraints to kill the
 predicate arg to find when the range is an AssumeSorted struct. If not,
 there should be.

 That's a good idea. The find predicate that could be derived from
 AssumeSorted's predicate pred would be !pred(a, b) && !pred(b, a).

 Thanks, Steve.

 You're welcome :)

 BTW, you don't need the combo predicate until the very end. Basically,
 you do a binary search for the first element where pred(a, E) is false
 (where E is the target), and then see if pred(E, a) is also false on
 that element (to test for equality).

You mean like this? :o)

http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d?rev=1279#L4703


Andrei

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 11:10:03 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 09:27 AM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 09:36:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 06:36 AM, Steven Schveighoffer wrote:
 Just thinking out loud here, couldn't you use the predicate already in
 AssumeSorted? I mean, if you're going to pass AssumeSorted into find,
 you don't want to also specify the predicate as then the range just
 becomes a standard range.

 There must be some kind of way to use template constraints to kill the
 predicate arg to find when the range is an AssumeSorted struct. If not,
 there should be.

 That's a good idea. The find predicate that could be derived from
 AssumeSorted's predicate pred would be !pred(a, b) && !pred(b, a).

 Thanks, Steve.

 You're welcome :)

 BTW, you don't need the combo predicate until the very end. Basically,
 you do a binary search for the first element where pred(a, E) is false
 (where E is the target), and then see if pred(E, a) is also false on
 that element (to test for equality).

 You mean like this? :o)

 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d?rev=1279#L4703

Yep.  I realized after I wrote this that you probably were already doing  
it :)

Interestingly, when doing the redblacktree I found that I tried to do
some optimization on the lookup of an element.  Basically, while I'm going  
down the tree looking for a single element, I'm using the binary predicate  
to move left or right.  However, if I move left (i.e. it's not less), then  
that could be the element I'm looking for!  So I try the opposite of the  
predicate to see if I should return.

But when I allow multiple identical elements (i.e. multiset) and want to
find the *first* instance of the element, the code is much simpler.  If I
move to the left child, I store that as the "Best result so far".  Then at  
the end, I simply run the opposite predicate once on the aforementioned  
best result.

The benefit of running the opposite predicate sooner is that if the  
element is higher in the tree, I'll return quicker, but I think it ends up  
being a wash.  I'll probably change it to be the same as the multi style  
tree.
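
The two lookup styles can be compared on a plain binary search tree. This is only a sketch under simplifying assumptions: Node, findEager and findFirst are hypothetical names, the tree is unbalanced and holds ints, and the predicate is hard-coded as `<`; it is not the redblacktree code.

```d
// Hypothetical node of a plain (unbalanced) binary search tree.
struct Node { int value; Node* left, right; }

// Eager style: whenever the search would move left, test the opposite
// direction right away and return on the first equivalent node.
const(Node)* findEager(const(Node)* n, int target)
{
    while (n !is null)
    {
        if (n.value < target) n = n.right;
        else if (target < n.value) n = n.left;
        else return n;
    }
    return null;
}

// Deferred style: remember the last node where the search moved left
// ("best result so far") and run the opposite test once at the end.
// With duplicates this yields the first instance in sorted order.
const(Node)* findFirst(const(Node)* n, int target)
{
    const(Node)* best = null;
    while (n !is null)
    {
        if (n.value < target) n = n.right;
        else { best = n; n = n.left; }
    }
    return (best !is null && !(target < best.value)) ? best : null;
}

void main()
{
    // 4 at the root, two 2s down the left spine, 6 on the right.
    auto root = new Node(4, new Node(2, new Node(2), null), new Node(6));

    assert(findEager(root, 2) is root.left);      // stops at the upper 2
    assert(findFirst(root, 2) is root.left.left); // first 2 in sorted order
    assert(findEager(root, 5) is null);
    assert(findFirst(root, 5) is null);
}
```

The eager version can return a few levels earlier when the element sits high in the tree, at the price of a second comparison on every step to the left.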

It all comes from the original code which used an int return for the  
comparison, making it just as simple to detect equality as it is to detect  
less-than.

Maybe I'm the first one to make that mistake :)

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 10:24 AM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 11:10:03 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 09:27 AM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 09:36:54 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 06:36 AM, Steven Schveighoffer wrote:
 Just thinking out loud here, couldn't you use the predicate already in
 AssumeSorted? I mean, if you're going to pass AssumeSorted into find,
 you don't want to also specify the predicate as then the range just
 becomes a standard range.

 There must be some kind of way to use template constraints to kill the
 predicate arg to find when the range is an AssumeSorted struct. If
 not,
 there should be.

 That's a good idea. The find predicate that could be derived from
 AssumeSorted's predicate pred would be !pred(a, b) && !pred(b, a).

 Thanks, Steve.

 You're welcome :)

 BTW, you don't need the combo predicate until the very end. Basically,
 you do a binary search for the first element where pred(a, E) is false
 (where E is the target), and then see if pred(E, a) is also false on
 that element (to test for equality).

 You mean like this? :o)

 http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d?rev=1279#L4703

 Yep. I realized after I wrote this that you probably were already doing
 it :)

Yah, it's quite the STL classic. STL commonly defines equivalence 
implicitly in terms of !less(a, b) && !less(b, a) but uses only one of 
the two comparisons until the last leg, when it tests the opposite way.
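
A rough sketch of that pattern on a sorted array, assuming int elements and a predicate passed as a function pointer. The names lowerBound and findSorted are hypothetical here and this is not the Phobos code behind the link above.

```d
// Index of the first element for which less(element, target) is false,
// or sorted.length if every element is less than target.
size_t lowerBound(const(int)[] sorted, int target,
                  bool function(int, int) less)
{
    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (less(sorted[mid], target))
            lo = mid + 1;   // everything up to mid is too small
        else
            hi = mid;       // mid might be the answer
    }
    return lo;
}

// Index of the first element equivalent to target, or sorted.length.
// The second comparison, less(target, element), runs exactly once.
size_t findSorted(const(int)[] sorted, int target,
                  bool function(int, int) less)
{
    immutable i = lowerBound(sorted, target, less);
    if (i < sorted.length && !less(target, sorted[i]))
        return i;
    return sorted.length;
}

unittest
{
    auto lt = function bool(int a, int b) { return a < b; };
    int[] data = [1, 3, 3, 5, 8];
    assert(findSorted(data, 3, lt) == 1);            // first of the two 3s
    assert(findSorted(data, 4, lt) == data.length);  // not present
}
```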

 Interestingly, I found that when doing the redblacktree that I tried to
 do some optimization on the lookup of an element. Basically, while I'm
 going down the tree looking for a single element, I'm using the binary
 predicate to move left or right. However, if I move left (i.e. it's not
 less), then that could be the element I'm looking for! So I try the
 opposite of the predicate to see if I should return.

Indeed that's 100% what STL's lower_bound and rb_tree.find do.

By the way, I'm still eagerly waiting for your red-black tree 
implementation. I think it would be pure awesomeness if you massaged the 
red/black bit inside one of the pointers. I figured out a way of doing 
that without throwing off the garbage collector:

union
{
     ubyte* _gcHelper;  // keeps a valid pointer into the node for the GC
     size_t _bits;      // same word; bit 0 holds the color
}
void setRed() { _bits |= 1; }
void setBlack() { _bits &= ~(cast(size_t) 1); }
bool isRed() { return (_bits & 1) != 0; }
RBTreeNode* left()
{
     // mask off the color bit to recover the real pointer
     return cast(RBTreeNode*) (_bits & ~(cast(size_t) 1));
}

The idea is to leave _gcHelper in there as a valid pointer to either a 
RBTreeNode or a pointer to one byte inside the RBTreeNode. That way the 
GC is never confused - it will keep the node.

I think putting that one bit inside the pointer has important consequences.

I also suggest you read up on "left-leaning red-black trees" for a 
recent alternative approach that simplifies the code a fair amount.


Andrei

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 12:21:36 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:
 By the way, I'm still eagerly waiting for your red-black tree  
 implementation.

Sorry for the delay, I've been very busy at work, and I wanted to slip in  
a couple druntime fixes for array appending.

All that is left really is the unit testing, and making the docs more  
phobos-ish.

 I think it would be pure awesomeness if you massaged the red/black bit  
 inside one of the pointers. I figured out a way of doing that without  
 throwing off the garbage collector:

Yes, that works (BTW, you don't need the union, I hate unions :), just  
substitute _bits for _left everywhere, I think it would even work with a  
moving GC).

But I don't know how important it is to save that extra 4 bytes/node.  A  
redblack node already has 3 pointers in it, the flag puts it to 16 bytes  
of overhead instead of 12.  It certainly can be an implementation  
choice.

I can look at left-leaning trees (I've had it on my todo list for  
dcollections too).

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 12:23 PM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 12:21:36 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 By the way, I'm still eagerly waiting for your red-black tree
 implementation.

 Sorry for the delay, I've been very busy at work, and I wanted to slip
 in a couple druntime fixes for array appending.

 All that is left really is the unit testing, and making the docs more
 phobos-ish.

 I think it would be pure awesomeness if you massaged the red/black bit
 inside one of the pointers. I figured out a way of doing that without
 throwing off the garbage collector:

 Yes, that works (BTW, you don't need the union, I hate unions :), just
 substitute _bits for _left everywhere, I think it would even work with a
 moving GC).

Walter told me that union is instrumental to keeping the compiler in the 
know about such shenanigans. What does your idea look like? You mean 
keeping a possibly misaligned RBTreeNode pointer and manipulating that? 
I think that's a bit worse than unions because it transforms a sure 
thing into a maybe works thing.

 But I don't know how important it is to save that extra 4 bytes/node. A
 redblack node already has 3 pointers in it, the flag puts it to 16 bytes
 of overhead instead of 12. It certainly can be an implementation
 choice.

 I can look at left-leaning trees (I've had it on my todo list for
 dcollections too).

Sounds great. If the payload is one word, on a 32-bit system we'd have 
20 bytes per node. I seem to recall the current GC can allocate 16 bytes 
and then 32 bytes and then 48 bytes, so with the embedded bit we're 
looking at halving the total allocated size. Not too shoddy! Then the 
relative overhead of that extra word is not felt up until a payload of 
20 bytes, at which point again it jumps to 33%.

I wonder what things look like (alignment, granularity) for the 64-bit 
implementation.


Andrei

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 13:47:38 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 12:23 PM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 12:21:36 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 By the way, I'm still eagerly waiting for your red-black tree
 implementation.

 Sorry for the delay, I've been very busy at work, and I wanted to slip
 in a couple druntime fixes for array appending.

 All that is left really is the unit testing, and making the docs more
 phobos-ish.

 I think it would be pure awesomeness if you massaged the red/black bit
 inside one of the pointers. I figured out a way of doing that without
 throwing off the garbage collector:

 Yes, that works (BTW, you don't need the union, I hate unions :), just
 substitute _bits for _left everywhere, I think it would even work with a
 moving GC).

 Walter told me that union is instrumental to keeping the compiler in the  
 know about such shenanigans. What does your idea look like? You mean  
 keeping a possibly misaligned RBTreeNode pointer and manipulating that?  
 I think that's a bit worse than unions because it transforms a sure  
 thing into a maybe works thing.

I don't pretend to know what ominous problems Walter knows about regarding  
the compiler's view, but here is what I'm thinking:

If a pointer points to the beginning of a node, and a node has at least  
one pointer in it (which it must, since it's a tree), then pointing one  
byte into the node means you're still pointing at the same block, making  
sure the GC doesn't collect.

Really, the generated code will be exactly the same as your solution, but  
it's less of a misuse of the type system IMO (believe it or not).  I'd  
rather use casts when you are trying to use something that's typed as one  
thing as something else.  When using unions, I usually expect only one  
member of the union to be valid at any one time.
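
A minimal sketch of the cast-only variant, assuming a hypothetical Node type whose _left field carries the color in bit 0. It relies on nodes being at least 2-byte aligned (pointer-aligned structs are) and says nothing about how a precise or moving collector would treat such a pointer.

```d
struct Node
{
    private Node* _left;   // bit 0 doubles as the red flag
    Node* right;
    Node* parent;
    int value;

    private enum size_t colorBit = 1;

    bool isRed() const { return (cast(size_t) _left & colorBit) != 0; }
    void setRed()   { _left = cast(Node*) (cast(size_t) _left | colorBit); }
    void setBlack() { _left = cast(Node*) (cast(size_t) _left & ~colorBit); }

    // Real left child: the stored word with the color bit masked off.
    @property Node* left()
    {
        return cast(Node*) (cast(size_t) _left & ~colorBit);
    }

    // Replace the left child while preserving the current color.
    @property void left(Node* n)
    {
        _left = cast(Node*) (cast(size_t) n | (cast(size_t) _left & colorBit));
    }
}

unittest
{
    auto child = new Node;
    auto n = new Node;
    n.left = child;
    n.setRed();
    assert(n.isRed() && n.left is child);
    n.setBlack();
    assert(!n.isRed() && n.left is child);
}
```

When the flag is set, _left points one byte into the child's block, which is the situation described above: still inside the same allocation.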

And wouldn't a union be more egregious with the upcoming mostly-precise  
scanner?

 But I don't know how important it is to save that extra 4 bytes/node. A
 redblack node already has 3 pointers in it, the flag puts it to 16 bytes
 of overhead instead of 12. It certainly can be an implementation
 choice.

 I can look at left-leaning trees (I've had it on my todo list for
 dcollections too).

 Sounds great. If the payload is one word, on a 32-bit system we'd have  
 20 bytes per node. I seem to recall the current GC can allocate 16 bytes  
 and then 32 bytes and then 48 bytes, so with the embedded bit we're  
 looking at halving the total allocated size. Not too shoddy!

Not quite :)  There is one byte for padding because (insert gasp-inspiring  
music accent) all struct heap allocations are allocated through newArray  
with a size of 1.  I discovered this when working on the array append  
patch.

So even a 16-byte struct requires a 32-byte block.
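
The arithmetic can be sketched as a toy model. It assumes small blocks come in power-of-two bins starting at 16 bytes and that a single struct allocation asks for one extra pad byte, as described above; binSize and structBlockSize are hypothetical helpers, not druntime functions.

```d
// Smallest power-of-two bin (minimum 16 bytes) that holds `request` bytes.
// Assumed model of the small-block bins, not the real allocator.
size_t binSize(size_t request)
{
    size_t bin = 16;
    while (bin < request)
        bin *= 2;
    return bin;
}

// Block size consumed by one heap-allocated struct, with or without
// the 1-byte pad that array allocations carry.
size_t structBlockSize(size_t structSize, bool padByte = true)
{
    return binSize(structSize + (padByte ? 1 : 0));
}

unittest
{
    assert(structBlockSize(16) == 32);         // 16 + 1 spills into the next bin
    assert(structBlockSize(16, false) == 16);  // without the pad it fits exactly
    assert(structBlockSize(12) == 16);
    assert(structBlockSize(20) == 32);
}
```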

 Then the relative overhead of that extra word is not felt up until a  
 payload of 20 bytes, at which point again it jumps to 33%.

Most of this is mitigated if you have a custom allocator that allocates an  
array of nodes at once (what I do in dcollections).  As a simple  
implementation, you could allocate enough nodes to be under a certain  
threshold of wasted space.

 I wonder what things look like (alignment, granularity) for the 64-bit  
 implementation.

They must be 8-byte aligned, and have 3 8-byte pointers, so that means at  
least 24 bytes.  If you store an int, then it will still fit in the  
32-byte block.  I don't know what's planned as the minimum size for 64-bit  
GC.

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 01:50 PM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 13:47:38 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 12:23 PM, Steven Schveighoffer wrote:
 On Mon, 19 Jul 2010 12:21:36 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 By the way, I'm still eagerly waiting for your red-black tree
 implementation.

 Sorry for the delay, I've been very busy at work, and I wanted to slip
 in a couple druntime fixes for array appending.

 All that is left really is the unit testing, and making the docs more
 phobos-ish.

 I think it would be pure awesomeness if you massaged the red/black bit
 inside one of the pointers. I figured out a way of doing that without
 throwing off the garbage collector:

 Yes, that works (BTW, you don't need the union, I hate unions :), just
 substitute _bits for _left everywhere, I think it would even work with a
 moving GC).

 Walter told me that union is instrumental to keeping the compiler in
 the know about such shenanigans. What does your idea look like? You
 mean keeping a possibly misaligned RBTreeNode pointer and manipulating
 that? I think that's a bit worse than unions because it transforms a
 sure thing into a maybe works thing.

 I don't pretend to know what ominous problems Walter knows about
 regarding the compiler's view, but here is what I'm thinking:

 If a pointer points to the beginning of a node, and a node has at least
 one pointer in it (which it must, since it's a tree), then pointing one
 byte into the node means you're still pointing at the same block, making
 sure the GC doesn't collect.

 Really, the generated code will be exactly the same as your solution,
 but it's less of a misuse of the type system IMO (believe it or not).
 I'd rather use casts when you are trying to use something that's typed
 as one thing as something else. When using unions, I usually expect only
 one member of the union to be valid at any one time.

 And wouldn't a union be more egregious with the upcoming mostly-precise
 scanner?

I don't think so (applied to all of the above) for reasons of various 
degree of obviousness, but perhaps it's not worth expanding on such a 
minor issue.

Andrei

Jul 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:

 Not quite :)  There is one byte for padding because (insert gasp-inspiring  
 music accent) all struct heap allocations are allocated through newArray  
 with a size of 1.  I discovered this when working on the array append  
 patch.

How much more hidden shit like this do I have to see?
I have filed a bug report:
http://d.puremagic.com/issues/show_bug.cgi?id=4487
Maybe Walter has to fix this before porting dmd to 64 bits.


 Most of this is mitigated if you have a custom allocator that allocates an  
 array of nodes at once

The GC is supposed to not suck that much. If I want to do all manually and use
custom allocators then maybe it's better if I start to switch to C language.
Thank you Steven.

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 15:53:46 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 Not quite :)  There is one byte for padding because (insert  
 gasp-inspiring
 music accent) all struct heap allocations are allocated through newArray
 with a size of 1.  I discovered this when working on the array append
 patch.

 How much more hidden shit like this do I have to see?
 I have filed a bug report:
 http://d.puremagic.com/issues/show_bug.cgi?id=4487
 Maybe Walter has to fix this before porting dmd to 64 bits.


 Most of this is mitigated if you have a custom allocator that allocates  
 an
 array of nodes at once

 The GC is supposed to not suck that much. If I want to do all manually  
 and use custom allocators then maybe it's better if I start to switch to  
 C language.
 Thank you Steven.

What's so horrible about it?  It's a corner case.  If you were allocating  
a 20-byte struct, you are wasting 12 bytes per value anyways.  Or what  
about a 36-byte struct?

All I'm saying is the pad byte itself isn't a huge issue, even if it  
didn't exist, there would always be inefficient allocations.  Take the  
redblack tree node for instance.  Get rid of the pad byte, and it's tuned  
for word-size payload.  But go over that, and you're back to pretty much  
wasting 50% of your memory.  If you want to tune your app to be the most  
efficient at memory allocation, you need to study the allocator and learn  
its tricks and nuances.  I think you have some misguided notion that C's  
allocator is perfect, and there's no hidden cost to it.  That's not the  
case.

That being said, it would be good if we could get rid of the 1-byte pad  
for single struct allocations on the heap.  As a bonus, the code will be  
more efficient because it doesn't have to deal with the "array" notion.

-Steve

Jul 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:
 What's so horrible about it?

Using about 210% of the RAM I was planning to use. And not even saying it in a
small print somewhere in the docs.


 It's a corner case.

4-word structs are quite common. It's not a common corner.


 All I'm saying is the pad byte itself isn't a huge issue, even if it  
 didn't exist, there would always be inefficient allocations.

That is far too inefficient.


 If you want to tune your app to be the most  
 efficient at memory allocation, you need to study the allocator and learn  
 its tricks and nuances.

The allocator can have some overhead, but this is offensive.


  I think you have some misguided notion that C's  
 allocator is perfect, and there's no hidden cost to it.  That's not the  
 case.

I have written plenty of code in C and its cost is not that high.

I have seen you have changed my bug report, the first I have signed as 'major'
into an 'enhancement'. And you have said:
 DMD functions exactly as designed.

Then I can answer it's a *design* bug. I feel offended.

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 16:24:04 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 What's so horrible about it?

 Using about 210% of the RAM I was planning to use. And not even saying  
 it in a small print somewhere in the docs.


 It's a corner case.

 4-word structs are quite common. It's not a common corner.

You mean it *is* a common corner?  I agree, and I think the bug report is  
warranted, we should get it fixed.  But again, you have specifically  
designed your input to thwart the system.  Those examples will always be  
possible to construct, unless you have a perfect memory allocator, which  
probably will be so slow that it's unusable.

 All I'm saying is the pad byte itself isn't a huge issue, even if it
 didn't exist, there would always be inefficient allocations.

 That is far too inefficient.

That is the cost of allocation schemes that use fixed size memory blocks.   
Especially when they grow in powers of 2.  Tune your app for it, and you  
won't have this problem.

 If you want to tune your app to be the most
 efficient at memory allocation, you need to study the allocator and  
 learn
 its tricks and nuances.

 The allocator can have some overhead, but this is offensive.

I guess my answer is, tune your app to the allocator.  If you allocate a  
lot of little items, you will go a long way by allocating an array of  
them instead and using a free list.

  I think you have some misguided notion that C's
 allocator is perfect, and there's no hidden cost to it.  That's not the
 case.

 I have written plenty of code in C and its cost is not that high.

So C does not use pools of fixed-size memory?  All its blocks it hands out  
are exactly the size you ask for?

Hm... let me test...

Yep, C does the same thing.  I wrote a small program to allocate 1,000,000  
blocks of a command-line-supplied size.

Up to 12 bytes per block allocates 17MiB, even if the blocks I request are  
1 byte.  For a single pointer, that's 300% overhead.  How horrid, let's  
lambaste the C allocator developers.  WHERE'S THE FINE PRINT!????

16 bytes per block allocates 24.6MiB.  Wait, shouldn't it be 17, surely  
that holds 16MiB of data?  What's going on here?  How can anyone be asked  
to deal with this shit?  Oh wait, I didn't *tune my app for the allocator*.

Above that, the C allocator does a good job of minimizing overhead, but  
that's for a plastic example of simply calling malloc 1M times in a row.   
And C has less to worry about than D when it comes to memory allocation  
and is far more experienced at it.  But it's not perfect in all  
situations.  It's much easier to tune your app to the allocator than tune  
the allocator to your app.

 I have seen you have changed my bug report, the first I have signed as  
 'major' into an 'enhancement'. And you have said:
 DMD functions exactly as designed.

 Then I can answer it's a *design* bug. I feel offended.

A design bug *is* an enhancement :)

-Steve

Jul 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:
 That is the cost of allocation schemes that use fixed size memory blocks.   
 Especially when they grow in powers of 2.  Tune your app for it, and you  
 won't have this problem.

I did know about the power of 2 allocations for small memory blocks, and I know
it's useful to reduce memory fragmentation. So I have tuned my code for that,
that's why I have several structs 16 bytes long, but now I have to target 15
bytes, that is not a power of 2 :o)


 A design bug *is* an enhancement :)

I see, I didn't know this. Sorry for losing my temper Steven...

Bye,
bearophile

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 17:01:34 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:
 That is the cost of allocation schemes that use fixed size memory  
 blocks.
 Especially when they grow in powers of 2.  Tune your app for it, and you
 won't have this problem.

 I did know about the power of 2 allocations for small memory blocks, and  
 I know it's useful to reduce memory fragmentation. So I have tuned my  
 code for that, that's why I have several structs 16 bytes long, but now  
 I have to target 15 bytes, that is not a power of 2 :o)

Hm... unfortunately, I think you will end up in the same boat.  Because  
any struct of size 15 is aligned to be on a 16-byte boundary.  From my  
memory, I don't think the array allocation code takes into account if the  
final element coincides with the pad byte, but I may be wrong.  Make sure  
to test this theory before going through and trying to trim bytes off all  
your structs.  I think if you use 12-byte structs, it will fit fine, but  
then of course, you are wasting 25% memory :)

If you can deal with some manual memory management, you may want to  
pre-allocate a large array of the structs and then use a free list to  
"allocate" and "deallocate" them.  This should pack them in as tightly as  
possible with almost no overhead.  Of course, if you depend on the GC to  
free your elements, then it might be more of a burden to change all your  
code to contain manual memory management.
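
A minimal sketch of that idea, assuming a hypothetical Node type and a fixed-capacity NodePool. Free nodes are chained through their own left pointer, so the free list costs no extra memory; growth and thread safety are left out.

```d
struct Node
{
    Node* left;
    Node* right;
    Node* parent;
    int value;
}

struct NodePool
{
    private Node[] block;      // one contiguous allocation for all nodes
    private Node* freeHead;    // head of the intrusive free list

    this(size_t capacity)
    {
        block = new Node[capacity];
        foreach (ref n; block)
        {
            n.left = freeHead; // reuse `left` as the free-list link
            freeHead = &n;
        }
    }

    // Hands out a zero-initialized node, or null when the pool is empty.
    Node* allocate()
    {
        if (freeHead is null)
            return null;
        auto n = freeHead;
        freeHead = n.left;
        *n = Node.init;
        return n;
    }

    // Returns a node to the pool; the caller must not use it afterwards.
    void deallocate(Node* n)
    {
        n.left = freeHead;
        freeHead = n;
    }
}

unittest
{
    auto pool = NodePool(2);
    auto a = pool.allocate();
    auto b = pool.allocate();
    assert(a !is null && b !is null && a !is b);
    assert(pool.allocate() is null);   // capacity exhausted
    pool.deallocate(a);
    assert(pool.allocate() is a);      // freed slot is reused
}
```

The GC still owns the backing array, but individual nodes are no longer collected automatically, which is the burden mentioned above.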

-Steve

Jul 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/19/2010 03:09 PM, Steven Schveighoffer wrote:
 Take the
 redblack tree node for instance. Get rid of the pad byte, and it's tuned
 for word-size payload. But go over that, and you're back to pretty much
 wasting 50% of your memory.

I think this characterization is a bit inaccurate because it suggests 
that there are gains only for one-word payloads. I think the truth is 
that there are gains for several payloads. Their relative value decreases 
with the size of the payload.

Long story short - the less slack (byte of overhead, bool for red/black 
information), the better (in various quanta and of various values).


Andrei

Jul 19 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 19 Jul 2010 16:41:50 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 07/19/2010 03:09 PM, Steven Schveighoffer wrote:
 Take the
 redblack tree node for instance. Get rid of the pad byte, and it's tuned
 for word-size payload. But go over that, and you're back to pretty much
 wasting 50% of your memory.

 I think this characterization is a bit inaccurate because it suggests  
 that there are gains only for one-word payloads. I think the truth is  
 that there are gains for several payloads. Their relative value decreases  
 with the size of the payload.

 Long story short - the less slack (byte of overhead, bool for red/black  
 information), the better (in various quanta and of various values).

There is a cost though...  which was my point.  Isn't everyone always  
saying around here how cheap memory is these days? ;)

-Steve

Jul 19 2010

bearophile <bearophileHUGS lycos.com> writes:

Steven Schveighoffer:
 Isn't everyone always  
 saying around here how cheap memory is these days? ;)

RAM is cheap, but the CPU doesn't use RAM, it mostly uses L1 cache (and a bit of
the L2/L3 caches too), and they cost a lot :-) The more space your data structure
uses, the less you can fit in the cache. Today cache effects are important for
the code performance.

This is a nice example, shows how reducing the size of the data structure and
changing its arrangement (the original was a normal tree, traversed for each
pixel) can improve the code performance by something like one order of
magnitude for ray-tracing:
http://www.cs.utah.edu/~bes/papers/fastRT/paper-node12.html

Bye,
bearophile

Jul 19 2010

BLS <windevguy hotmail.de> writes:

On 19/07/2010 18:21, Andrei Alexandrescu wrote:
 I also suggest you read up on "left-leaning red-black trees" for a
 recent alternative approach that simplifies the code a fair amount.

A few months ago (12-15) I also made that suggestion to Steve. 
Meanwhile I am not so sure that LL RB trees offer a significant 
reduction in complexity. R. Sedgewick's original implementation in Java 
is not bulletproof.

Don't get me wrong, LL RB trees have a certain appeal, but read for yourself.

--In case that you don't want to use this link :
http://t-t-travails.blogspot.com/2008/04/left-leaning-red-black-trees-are-hard.html
--Here a quote <
Last Monday, I started implementing left-leaning red-black trees, 
expecting to spend perhaps 15 hours on the project. I'm here more than 
60 hours of work later to tell you that left-leaning red-black trees are 
hard to implement, and contrary to Sedgewick's claims, their 
implementation appears to require approximately the same amount of code 
and complexity as standard red-black trees.

Meanwhile I am convinced that skip lists are a more interesting 
alternative to RB trees. Besides, the Qt folks are using the skip list 
algorithm for their map implementation.

 Andrei, I hope you have noticed that Steve's dcollections allows the 
replacement of the underlying data structure. ;)
So IMHO let's spend some time on implementing the skip list data structure.
Finally, I would like to see a std.datastructures for core 
tree, list, graph etc. structures.
A+.
Bjoern

Jul 19 2010

Lutger <lutger.blijdestijn gmail.com> writes:

bearophile wrote:

 I have counted about 200 usages of std.contracts.enforce() inside Phobos.
 Can you tell me what's the purpose of enforce() in a language that has
 built-in Contract Programming?

I'd think of it this way: enforce() is part of defensive programming, and
contracts are related to software testing. 
 
 And what are the purposes of std.contracts.AssumeSorted()? Is it useful
 for something?
 
 Bye,
 bearophile
 (I know this is not the digitalmars.D.learn newsgroup).

Jun 15 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Lutger wrote:

 bearophile wrote:
 
 I have counted about 200 usages of std.contracts.enforce() inside Phobos.
 Can you tell me what's the purpose of enforce() in a language that has
 built-in Contract Programming?

 
 I'd think of it this way: enforce() is part of defensive programming, and
 contracts are related to software testing.

That's probably a pretty good way of putting it. It's essentially the 
difference between when you use assertions and when you use exceptions. 
Assertions assert that something is _always_ true and that if it isn't, the 
program is wrong, while exceptions are for exceptional circumstances (as 
opposed to _never_) and indicate an error of some kind which is likely 
outside the control of the program - such as something happening with the 
file system, user input, or the amount of available memory.

enforce() appears to effectively be the exception equivalent to assert(). 
You use it when you want an exception thrown rather than when you want to 
kill your program due to an error. Unfortunately, the difference between 
when assertions should be used and when exceptions should be used is one 
that is just subtle enough that it often trips people up, even though in 
theory it should be fairly straightforward.
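
A small sketch of that split, using hypothetical function names. It assumes the current Phobos location of enforce (std.exception, the later name of the std.contracts module discussed in this thread).

```d
import std.exception : enforce;

// External input can legitimately be wrong at run time,
// so a bad value is reported with an Exception.
int parsePercent(int raw)
{
    enforce(raw >= 0 && raw <= 100, "percentage must be between 0 and 100");
    return raw;
}

// Internal invariant: callers are expected to pass an already validated
// value. A failure here means the program itself is wrong, and the check
// may be compiled out in release builds.
int applyPercent(int total, int percent)
{
    assert(percent >= 0 && percent <= 100, "unvalidated percentage");
    return total * percent / 100;
}

unittest
{
    assert(applyPercent(200, parsePercent(50)) == 100);

    bool threw = false;
    try { parsePercent(150); }
    catch (Exception e) { threw = true; }
    assert(threw);
}
```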

- Jonathan M Davis

Jun 16 2010

Ali Çehreli <acehreli yahoo.com> writes:

Jonathan M Davis wrote:

 Assertions assert that something is _always_ true and that if it isn't, the
 program is wrong, while exceptions are for exceptional circumstances

Makes sense.

 You use [enforce] when you want an exception thrown rather than when you want to
 kill your program due to an error.

To further confuse the issue, assert throws too:

import std.stdio;
import std.algorithm;

void main()
{
     try {
         assert(false);
     } catch (Throwable) {
         writeln("an assertion failed");
     }
}

The difference is just the exception that is thrown. Throwable seems to 
be most general.

 From what I've read so far, I take enforce as a replacement to what it 
exactly is:

if (!condition) {
     throw /* ... */;
}

Since I never use assert for that purpose, I take enforce as a shortcut 
for the above.
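
A minimal sketch of that equivalence, with hypothetical function names and assuming enforce is imported from std.exception as in current Phobos. Note that enforce throws when the condition is false, and that it returns the value it checked.

```d
import std.exception : enforce;

// Hand-written check.
int checkManually(int result)
{
    if (!(result >= 0))
        throw new Exception("call failed");
    return result;
}

// The same check expressed with enforce.
int checkWithEnforce(int result)
{
    enforce(result >= 0, "call failed");
    return result;
}

// enforce passes its first argument through, so checking and
// using a value can be one expression.
int* requireNonNull(int* p)
{
    return enforce(p, "unexpected null");
}

unittest
{
    assert(checkManually(3) == checkWithEnforce(3));

    int x = 7;
    assert(requireNonNull(&x) is &x);
}
```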

Ali

Jun 16 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Ali Çehreli <acehreli yahoo.com> wrote:

 To further confuse the issue, assert throws too:

 import std.stdio;
 import std.algorithm;

 void main()
 {
      try {
          assert(false);
      } catch (Throwable) {
          writeln("an assertion failed");
      }
 }

 The difference is just the exception that is thrown. Throwable seems to
 be most general.

Seeing as how Error is supposed to be unrecoverable, and Exception might
be recoverable, and both inherit from Throwable, one should only very very
rarely catch Exception, and by extension, Throwable. One might in fact
argue that Error and Exception should have no common ancestor but Object.


  From what I've read so far, I take enforce as a replacement to what it
 exactly is:

 if (!condition) {
      throw /* ... */;
 }

That is indeed basically what it is.


-- 

Simen

Jun 16 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Ali Çehreli wrote:

 Jonathan M Davis wrote:

  > Assertions assert that something is _always_ true and that if it
 isn't, the
  > program is wrong, while exceptions are for exceptional circumstances

 Makes sense.

  > You use [enforce] when you want an exception thrown rather than when
 you want to
  > kill your program due to an error.

 To further confuse the issue, assert throws too:

 import std.stdio;
 import std.algorithm;

 void main()
 {
      try {
          assert(false);
      } catch (Throwable) {
          writeln("an assertion failed");
      }
 }

 The difference is just the exception that is thrown. Throwable seems to
 be most general.

  From what I've read so far, I take enforce as a replacement to what it
 exactly is:

 if (!condition) {
      throw /* ... */;
 }

 Since I never use assert for that purpose, I take enforce as a shortcut
 for the above.

 Ali

Well, in a sense, the fact that assertions throw is an implementation detail 
since that's not the case in all languages. The concepts of assertions and 
exceptions are distinctly different.

However, while assertions do throw in D, they throw AssertErrors which are 
Errors and not exceptions, albeit both are Throwable. So, they're still 
different. You _can_ catch Errors, but you probably shouldn't. I believe 
that they're intended for pretty much unrecoverable errors. The fact that 
they're thrown likely makes it easier to exit the program semi-gracefully - 
or at least makes it easier for the generated program to properly indicate 
an error rather than simply dying - but they're still distinctly separate 
from exceptions and shouldn't generally be caught. I suppose that it's kind 
of like the difference between checked and unchecked exceptions in Java. You 
generally only worry about the checked ones.

You are right though in that the fact that Errors are Throwable does muddle 
things somewhat.
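
A short sketch of how the two branches of the hierarchy behave under catch. The classify helper is hypothetical; AssertError is constructed directly here so the example does not depend on whether asserts are compiled in, but it is the same type a failed assert throws in a non-release build.

```d
import core.exception : AssertError;
import std.exception : enforce;

// Runs the action and reports which branch of Throwable it threw from.
string classify(void delegate() action)
{
    try
    {
        action();
        return "nothing";
    }
    catch (Exception e)
    {
        return "Exception";   // the recoverable branch; enforce lands here
    }
    catch (Error e)
    {
        return "Error";       // shown for illustration; not normally caught
    }
}

unittest
{
    assert(classify({ enforce(false, "bad input"); }) == "Exception");
    assert(classify({ throw new AssertError("invariant broken"); }) == "Error");
    assert(classify({ }) == "nothing");
}
```

A plain catch (Exception) therefore never sees an assertion failure, which is one practical consequence of the split described above.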

- Jonathan M Davis

Jun 16 2010

Ali Çehreli <acehreli yahoo.com> writes:

bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside Phobos. Can
you tell me what's the purpose of enforce() in a language that has built-in
Contract Programming?

I can see two benefits:

1) enforce throws object.Exception, which is "the root of the exception 
hierarchy"; hence enforce errors can be caught with the same general 
catch(Exception) clause [*].

On the other hand, assert throws a type that is out of the Exception 
hierarchy: core.exception.AssertError

2) As a bonus, the word 'enforce' fits the purpose better than 'assert'

3) (the other 2 :p) The format of the message of the uncaught exceptions 
is a little better (e.g. no @ sign before the file name)

Ali

* Note: Actually, Throwable is at the top of the exception hierarchy, 
but I've heard before that the top exception class should be taken to be 
Exception; perhaps for user applications?

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside 
 Phobos. Can you tell me what's the purpose of enforce() in a language 
 that has built-in Contract Programming?

 
 I can see two benefits:

The difference is not based on those 3 points, but on what Andrei wrote here. 
Contracts and error checking are completely distinct activities and should not 
be conflated.

Jun 16 2010

Ary Borenszweig <ary esperanto.org.ar> writes:

On 06/16/2010 04:15 PM, Walter Bright wrote:
 Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 I can see two benefits:

 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

Could you please explain them? There are many people here who don't 
understand the difference between these two concepts (including me). So 
maybe we are too dumb, maybe those concepts are not generally known, or 
maybe the explanation is not very clear in the documentation.

Jun 16 2010

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 16 Jun 2010 05:28:46 -0400, Ary Borenszweig <ary esperanto.org.ar>  
wrote:

 On 06/16/2010 04:15 PM, Walter Bright wrote:
 Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 I can see two benefits:

 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 Could you please explain them? There are many people here that don't  
 understand the difference between these two concepts (including me). So  
 maybe we are too dumb, maybe those concepts are not generally known or  
 maybe the explanation is not very well clear in the documentation.

I think of enforce as a convenient way of translating an error in an  
expectation to an exception in a single expression.

For example, take some system call that returns -1 on error, you could do  
this:

if(result < 0)
    throw new Exception("oops!");

or you could do this:

enforce(result >= 0, "oops!");

Think of enforce as "throw if"

And in fact, I think there's an errnoEnforce which throws a standard  
exception with the string error from the system.

I'd say the difference between enforce and assert is exactly what Andrei  
said -- enforce is meant to catch errors that can occur during normal  
operation.  Assert is meant to catch errors that are not expected during  
normal operation.  Assert's more like a sanity check.  Also, assert is  
turned off in release mode, enforce is left on.
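
A minimal, self-contained sketch of the "throw if" equivalence, assuming a 
current Phobos where enforce is imported from std.exception. fakeSyscall 
and the two check functions are hypothetical names used only here:

```d
import std.exception : enforce, collectException;

// Hypothetical stand-in for a system call that returns -1 on error.
int fakeSyscall(bool fail)
{
    return fail ? -1 : 0;
}

// Hand-written "throw if" form.
void checkManually(int result)
{
    if (result < 0)
        throw new Exception("oops!");
}

// The same check as a single expression.
void checkWithEnforce(int result)
{
    enforce(result >= 0, "oops!");
}

void main()
{
    // Both forms pass silently on success.
    checkManually(fakeSyscall(false));
    checkWithEnforce(fakeSyscall(false));

    // On failure both throw a plain Exception carrying the same message.
    auto e1 = collectException(checkManually(fakeSyscall(true)));
    auto e2 = collectException(checkWithEnforce(fakeSyscall(true)));
    assert(e1 !is null && e1.msg == "oops!");
    assert(e2 !is null && e2.msg == "oops!");
}
```

Unlike an assert, the enforce check stays in place when the program is 
compiled with -release.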

-Steve

Jun 16 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 16 Jun 2010 06:55:21 -0400, Steven Schveighoffer wrote:

 On Wed, 16 Jun 2010 05:28:46 -0400, Ary Borenszweig
 <ary esperanto.org.ar> wrote:
 
 On 06/16/2010 04:15 PM, Walter Bright wrote:
 Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a
 language that has built-in Contract Programming?

 I can see two benefits:

 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 Could you please explain them? There are many people here that don't
 understand the difference between these two concepts (including me). So
 maybe we are too dumb, maybe those concepts are not generally known or
 maybe the explanation is not very well clear in the documentation.

 
 I think of enforce as a convenient way translating an error in an
 expectation to an exception in a single expression.
 
 For example, take some system call that returns -1 on error, you could
 do this:
 
 if(result < 0)
     throw new Exception("oops!");
 
 or you could do this:
 
 enforce(result >= 0, "oops!");
 
 Think of enforce as "throw if"

It also adds a file and a line number to the error message, so the 
problem is easier to track down.  Very handy. :)


 And in fact, I think there's an errnoEnforce which throws a standard
 exception with the string error from the system.

That's right, and there's even an enforceEx() which lets you specify 
which exception type to throw:

http://digitalmars.com/d/2.0/phobos/std_contracts.html#enforceEx
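
A minimal sketch of throwing a specific exception type. It assumes a 
current Phobos, where this is spelled enforce!E in std.exception 
(enforceEx is the 2010 spelling). ConfigException and checkedPort are 
hypothetical names invented for the example:

```d
import std.exception : enforce, basicExceptionCtors, collectException;

// Hypothetical exception type, defined only for this sketch.
class ConfigException : Exception
{
    // Generates the usual (msg, file, line, next) constructors.
    mixin basicExceptionCtors;
}

// Returns the port unchanged, or throws ConfigException if it is out of range.
int checkedPort(int port)
{
    enforce!ConfigException(port > 0 && port < 65536, "port out of range");
    return port;
}

void main()
{
    assert(checkedPort(8080) == 8080);

    auto e = collectException!ConfigException(checkedPort(0));
    assert(e !is null);
    assert(e.msg == "port out of range");
    // The file and line of the enforce call site are recorded on the exception.
    assert(e.line > 0);
}
```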

-Lars

Jun 16 2010

Alex Makhotin <alex bitprox.com> writes:

Steven Schveighoffer wrote:
  
 Think of enforce as "throw if"
 

So why not concatenate the two and rename it to exactly 'throwif'?
Self-descriptive is better than the cryptic 'enforce'.


-- 
Alex Makhotin,
the founder of BITPROX,
http://bitprox.com

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Alex Makhotin wrote:
 Steven Schveighoffer wrote:
  
 Think of enforce as "throw if"

 
 So why not concatenating the two and rename it to exactly 'throwif'?
 Self descriptive is better than cryptic 'enforce'.

Well throwif describes mechanism and enforce describes intent. After all 
assert is not abortif :o).

Andrei

Jun 16 2010

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 16/06/2010 15:45, Andrei Alexandrescu wrote:
 Well throwif describes mechanism and enforce describes intent. After all
 assert is not abortif :o).

 Andrei

Indeed, especially given that other code may use throwing as a 
mechanism, but with a different intent. And you would not be able to 
distinguish the two if you called it throwif.


-- 
Bruno Medeiros - Software Engineer

Jun 17 2010

Leandro Lucarella <luca llucax.com.ar> writes:

Steven Schveighoffer, el 16 de junio a las 06:55 me escribiste:
 On Wed, 16 Jun 2010 05:28:46 -0400, Ary Borenszweig
 <ary esperanto.org.ar> wrote:
 
On 06/16/2010 04:15 PM, Walter Bright wrote:
Ali Çehreli wrote:
bearophile wrote:
I have counted about 200 usages of std.contracts.enforce() inside
Phobos. Can you tell me what's the purpose of enforce() in a language
that has built-in Contract Programming?

I can see two benefits:

The difference is not based on those 3 points, but on what Andrei wrote
here. Contracts and error checking are completely distinct activities
and should not be conflated.

Could you please explain them? There are many people here that
don't understand the difference between these two concepts
(including me). So maybe we are too dumb, maybe those concepts are
not generally known or maybe the explanation is not very well
clear in the documentation.

 
 I think of enforce as a convenient way translating an error in an
 expectation to an exception in a single expression.
 
 For example, take some system call that returns -1 on error, you
 could do this:
 
 if(result < 0)
    throw new Exception("oops!");
 
 or you could do this:
 
 enforce(result >= 0, "oops!");
 
 Think of enforce as "throw if"

So maybe throw_if() would be a better name =)

Anyway, I think enforce() is poisson, because it makes the programmer not
think about errors at all: just add an enforce() and there you go.
But when you need to be fault tolerant, it is very important to know the
nature of the error. Thanks to enforce(), almost every error is
a plain Exception: no hierarchy, no extra info. All you can do to get
a little more info about what happened is to parse the exception string,
and that's not really an option.

 And in fact, I think there's an errnoEnforce which throws a standard
 exception with the string error from the system.

That's the only useful case of enforce, because it includes the
*important* information (the actual errno).

There is also enforceEx!(), to use a custom exception, which practically
nobody uses (I counted only 4 uses in phobos).

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Y será el día en que la electricidad deje de ser rayo y sea depilador
femenino.
	-- Ricardo Vaporeso

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leandro Lucarella wrote:
 Steven Schveighoffer, el 16 de junio a las 06:55 me escribiste:
 On Wed, 16 Jun 2010 05:28:46 -0400, Ary Borenszweig
 <ary esperanto.org.ar> wrote:

 On 06/16/2010 04:15 PM, Walter Bright wrote:
 Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 I can see two benefits:

 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 Could you please explain them? There are many people here that
 don't understand the difference between these two concepts
 (including me). So maybe we are too dumb, maybe those concepts are
 not generally known or maybe the explanation is not very well
 clear in the documentation.

 I think of enforce as a convenient way translating an error in an
 expectation to an exception in a single expression.

 For example, take some system call that returns -1 on error, you
 could do this:

 if(result < 0)
    throw new Exception("oops!");

 or you could do this:

 enforce(result >= 0, "oops!");

 Think of enforce as "throw if"

 
 So maybe throw_if() would be a better name =)
 
 Anyway, I think enforce() is poisson,

Indeed it is a bit fishy :o).

 because it make the programmer to
 not think about errors at all, just add and enforce() and there you go.
 But when you need to be fault tolerant, is very important to know what's
 the nature of the error, but thanks to enforce(), almost every error is
 a plain Exception, no hierarchy, no extra info, all you can do to get
 a little more info about what happened is to parse the exception string,
 and that's not really an option.

I think there is no real need for exception hierarchies. I occasionally 
dream of eliminating all of the useless exceptions defined left and 
right in Phobos.

 And in fact, I think there's an errnoEnforce which throws a standard
 exception with the string error from the system.

 
 That's the only useful case of enforce, because it includes the
 *important* information (the actual errno).
 
 There is also enforceEx!(), to use a custom exception, which practically
 nobody uses (I counted only 4 uses in phobos).

I'd be hard pressed to find good examples of exception hierarchy use. 
Everybody talks about them but I've seen none.

The fact that the coder doesn't need to think hard to use enforce() 
effectively is a plus, not a minus. An overdesigned enforce that adds 
extra burden to its user would have been a mistake.


Andrei

Jun 16 2010

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Everybody talks about them but I've seen none.
 The fact that the coder doesn't need to think hard to use enforce()
 effectively is a plus, not a minus. An overdesigned enforce that adds
 extra burden to its user would have been a mistake.
 Andrei

IMHO the presence of a simple method of handling errors, even if it's far from
perfect, is a good thing.  If you have to think about a whole exception
hierarchy every time you hit a possible error condition in your code, you tend
to put this tedious task off until forever, leading to programs that fail for
unknown reasons because some error condition was never reported.  Well-designed
exception hierarchies are nice, but forcing their use all the time would be
making the perfect the enemy of the good.

Furthermore, I love enforce() because sometimes I want just some subset of
assertions checked in release mode, usually whichever ones can be checked at
negligible performance cost.  I tend to use it a lot as an
assert-even-in-release-mode function.

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 10:53:12 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Leandro Lucarella wrote:
 Steven Schveighoffer, el 16 de junio a las 06:55 me escribiste:
 On Wed, 16 Jun 2010 05:28:46 -0400, Ary Borenszweig
 <ary esperanto.org.ar> wrote:
 
 On 06/16/2010 04:15 PM, Walter Bright wrote:
 Ali Çehreli wrote:
 bearophile wrote:
 I have counted about 200 usages of std.contracts.enforce() inside
 Phobos. Can you tell me what's the purpose of enforce() in a language
 that has built-in Contract Programming?

 I can see two benefits:

 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 Could you please explain them? There are many people here that
 don't understand the difference between these two concepts
 (including me). So maybe we are too dumb, maybe those concepts are
 not generally known or maybe the explanation is not very well
 clear in the documentation.

 I think of enforce as a convenient way translating an error in an
 expectation to an exception in a single expression.
 
 For example, take some system call that returns -1 on error, you
 could do this:
 
 if(result < 0)
    throw new Exception("oops!");
 
 or you could do this:
 
 enforce(result >= 0, "oops!");
 
 Think of enforce as "throw if"

 
 So maybe throw_if() would be a better name =)
 
 Anyway, I think enforce() is poisson,

 
 Indeed it is a bit fishy :o).
 
 because it make the programmer to
 not think about errors at all, just add and enforce() and there you go.
 But when you need to be fault tolerant, is very important to know what's
 the nature of the error, but thanks to enforce(), almost every error is
 a plain Exception, no hierarchy, no extra info, all you can do to get
 a little more info about what happened is to parse the exception string,
 and that's not really an option.

 
 I think there is no real need for exception hierarchies. I occasionally 
 dream of eliminating all of the useless exceptions defined left and 
 right in Phobos.

The need is not really for a hierarchy. The hierarchy serves the need, 
which is to:

1. Be able to programmatically check the kind of the error so your 
program can act appropriately.
2. Propagate additional information related to the error and the 
context in which it occurred.

Displaying a proper error message to a user and offering relevant 
recovery choices often needs both. Sometimes a program won't ask the 
user and will attempt a recovery by itself. In both cases, you 
need to know the kind of error, and may need context information.


 And in fact, I think there's an errnoEnforce which throws a standard
 exception with the string error from the system.

 
 That's the only useful case of enforce, because it includes the
 *important* information (the actual errno).
 
 There is also enforceEx!(), to use a custom exception, which practically
 nobody uses (I counted only 4 uses in phobos).

 
 I'd be hard pressed to find good examples of exception hierarchy use. 
 Everybody talks about them but I've seen none.

The need is not really for a hierarchy. The hierarchy serves the need, 
which is to:

1. Programmatically check the kind of the error so your program can 
act appropriately.
2. Propagate additional information related to the error and the 
context in which it occurred.

Displaying a proper error message to a user and offering relevant 
recovery choices often needs both. Sometimes a program won't ask the 
user and will attempt a recovery by itself. In both cases, you 
need to know the kind of error, and may need context information.

That said, hierarchies are often abused, and aren't universally useful. 
But exceptions should provide the above information in one way or another 
when useful.

Think about a GUI program: if an exception is thrown somewhere during a 
complex operation (say, reading a lot of files), I could catch it at 
some level, create a wrapper exception with the context (file=hello.d 
error=access denied) and rethrow it to unwind until reaching the GUI 
error handler. Or the file function could throw a useful exception from 
the start. In either case, the code in charge of that operation can 
display a message such as "Creating the archive failed. File 'hello.d' 
could not be read because you do not have read permissions to it." with 
options "Retry as Administrator", "Exclude 'hello.d'" or "Cancel". 
Knowing programmatically what has gone wrong is important in many cases.
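
A rough sketch of the catching side of that scenario, assuming a current 
Phobos (enforce from std.exception). FileAccessException, addToArchive and 
describeFailure are hypothetical names, and the "readable" flag stands in 
for a real permission check:

```d
import std.exception : enforce;

// Hypothetical exception type carrying machine-readable context.
class FileAccessException : Exception
{
    string filename;

    this(string filename, string file = __FILE__, size_t line = __LINE__)
    {
        super("access denied to " ~ filename, file, line);
        this.filename = filename;
    }
}

// Hypothetical low-level operation that reports the failing file.
void addToArchive(string filename, bool readable)
{
    enforce(readable, new FileAccessException(filename));
}

// The code in charge of the operation picks a message by exception type.
string describeFailure(string filename, bool readable)
{
    try
    {
        addToArchive(filename, readable);
        return "ok";
    }
    catch (FileAccessException e)
    {
        // Known kind of error: the context is available as a field,
        // so there is no need to parse the message string.
        return "File '" ~ e.filename ~ "' could not be read.";
    }
    catch (Exception e)
    {
        // Unknown kind of error: only the message is available.
        return "Creating the archive failed: " ~ e.msg;
    }
}

void main()
{
    assert(describeFailure("hello.d", true) == "ok");
    assert(describeFailure("hello.d", false) == "File 'hello.d' could not be read.");
}
```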


 The fact that the coder doesn't need to think hard to use enforce() 
 effectively is a plus, not a minus. An overdesigned enforce that adds 
 extra burden to its user would have been a mistake.

That's indeed true. Throwing an Exception with no info is still better 
than not throwing at all, and creating useful exceptions isn't always 
easy, nor economically rewarding. What's important is to make it easy 
to improve the thrown exception when it becomes relevant. For instance

	// first version: throws standard exception
	enforce(1 == 1, "access denied to " ~ filename);

	// refined version: throws custom exception
	enforce(1 == 1, new FileException("access denied to " ~ filename,
		accessDeniedError, filename));


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Leandro Lucarella <luca llucax.com.ar> writes:

Andrei Alexandrescu, el 16 de junio a las 07:53 me escribiste:
because it make the programmer to
not think about errors at all, just add and enforce() and there you go.
But when you need to be fault tolerant, is very important to know what's
the nature of the error, but thanks to enforce(), almost every error is
a plain Exception, no hierarchy, no extra info, all you can do to get
a little more info about what happened is to parse the exception string,
and that's not really an option.

 
 I think there is no real need for exception hierarchies. I
 occasionally dream of eliminating all of the useless exceptions
 defined left and right in Phobos.

Exception hierarchy is only one way to discriminate error types. Extra
info, is another (like an error code). I agree that a *large* exception
hierarchy hurts more than it helps.

And in fact, I think there's an errnoEnforce which throws a standard
exception with the string error from the system.

That's the only useful case of enforce, because it includes the
*important* information (the actual errno).

There is also enforceEx!(), to use a custom exception, which practically
nobody uses (I counted only 4 uses in phobos).

 
 I'd be hard pressed to find good examples of exception hierarchy
 use. Everybody talks about them but I've seen none.

I think Python has a good one. I find myself discriminating between
ValueError, IndexError, KeyError, OSError and IOError all the time.

 The fact that the coder doesn't need to think hard to use enforce()
 effectively is a plus, not a minus. An overdesigned enforce that
 adds extra burden to its user would have been a mistake.

That is, if you don't care about handling errors and just let the program crash
with a backtrace, or add a big try {} catch (Exception) in main. If
that's not the case, it only produces a false feeling that D's standard
library is good at handling errors when it's not; it's just a binary
"there is an error" / "there are no errors".

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
The average person laughs 13 times a day

Jun 16 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Andrei Alexandrescu wrote:

 Anyway, I think enforce() is poisson,

 
 Indeed it is a bit fishy :o).

LOL. My thoughts exactly.
 
 because it make the programmer to
 not think about errors at all, just add and enforce() and there you go.
 But when you need to be fault tolerant, is very important to know what's
 the nature of the error, but thanks to enforce(), almost every error is
 a plain Exception, no hierarchy, no extra info, all you can do to get
 a little more info about what happened is to parse the exception string,
 and that's not really an option.

 
 I think there is no real need for exception hierarchies. I occasionally
 dream of eliminating all of the useless exceptions defined left and
 right in Phobos.
 
 And in fact, I think there's an errnoEnforce which throws a standard
 exception with the string error from the system.

 
 That's the only useful case of enforce, because it includes the
 *important* information (the actual errno).
 
 There is also enforceEx!(), to use a custom exception, which practically
 nobody uses (I counted only 4 uses in phobos).

 
 I'd be hard pressed to find good examples of exception hierarchy use.
 Everybody talks about them but I've seen none.
 
 The fact that the coder doesn't need to think hard to use enforce()
 effectively is a plus, not a minus. An overdesigned enforce that adds
 extra burden to its user would have been a mistake.
 
 
 Andrei

I think that exception hierarchies can be quite useful, but in most cases, I 
haven't seen projects bother with them. I do think that certain types of 
exceptions can be useful as separate types as long as they inherit from the 
base exception type and you therefore don't _have_ to worry about the 
hierarchy.

A good example of a useful exception type IMO is Java's IOException. It 
makes good sense to handle them in a specific way separate from general 
exceptions. You can frequently recover just fine from them, and it allows 
you to handle I/O-related exceptions gracefully while other exceptions might 
be considered fatal. However, those other exceptions - especially those 
which are from more or less unrecoverable errors, like NullPointerExceptions or 
OutOfMemoryErrors - don't necessarily gain much from an exception 
hierarchy. So, I think that it really depends on the exception.

I do think, however, that there are certain types of exceptions which can 
benefit from having their own type because it allows you to handle them in a 
specific manner separate from general and/or unrecoverable exceptions.

- Jonathan M Davis

Jun 16 2010

Jason Spencer <spencer8 sbcglobal.net> writes:

I think about it roughly this way (in reverse priority):

Contracts/assertions concern problems in the program(ming) domain.

Exceptions concern problems in the system domain.

Problems in the actual problem domain should be modeled in the design
and have their own abstractions.

These interact a little bit, so I have an excuse to bend my rules
whenever I want :)  For instance, if the system is part of your
problem domain (e.g. embedded code), then exceptions are probably not
the right approach.  That's why I indicate a false idea of priority.

Jason

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Ary Borenszweig wrote:
 On 06/16/2010 04:15 PM, Walter Bright wrote:
 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 
 Could you please explain them? There are many people here that don't 
 understand the difference between these two concepts (including me). So 
 maybe we are too dumb, maybe those concepts are not generally known or 
 maybe the explanation is not very well clear in the documentation.

It has nothing to do with being dumb, as it is not obvious.

Contracts are for verifying that your program is in a state that it is designed 
to be in. A contract failure is defined as a program bug.

Errors, on the other hand, are things that can go wrong at run time, like your 
disk is full when trying to write a file. These are NOT program bugs.

Another way to look at it is your program should continue to operate correctly 
if all the contracts are removed. This is not true of removing all error 
checking and handling.

Furthermore, errors are something a program can recover from and continue 
operating. Contract failures are ALWAYS fatal. A common newbie (and some expert) 
misconception is that contract failures can or even must be recovered. This 
comes from a misunderstanding of the basic principles of engineering a safe and 
reliable system.
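
A minimal sketch of that split, assuming a current D compiler (the "do" 
keyword for contract bodies) and enforce from std.exception. average and 
averageOfUserInput are hypothetical functions made up for the illustration:

```d
import std.exception : enforce, collectException;

// Internal helper: callers inside the program must never pass an empty slice.
double average(const double[] values)
in
{
    // Contract: a failure here is a bug in the caller. The check is compiled
    // out with -release, and a correct program behaves the same without it.
    assert(values.length > 0, "average() requires at least one value");
}
do
{
    double sum = 0;
    foreach (v; values)
        sum += v;
    return sum / values.length;
}

// Boundary with the outside world: empty input is an expected run-time
// condition, so it is reported with a recoverable Exception.
double averageOfUserInput(const double[] userValues)
{
    enforce(userValues.length > 0, "no values supplied");
    return average(userValues);
}

void main()
{
    assert(averageOfUserInput([1.0, 2.0, 3.0]) == 2.0);

    double[] empty;
    auto e = collectException(averageOfUserInput(empty));
    assert(e !is null && e.msg == "no values supplied");
}
```

Removing the in contract leaves a correct program correct. Removing the 
enforce would let bad input reach code that was never designed for it.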

Jun 16 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Walter Bright wrote:

 Ary Borenszweig wrote:
 On 06/16/2010 04:15 PM, Walter Bright wrote:
 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 
 Could you please explain them? There are many people here that don't
 understand the difference between these two concepts (including me). So
 maybe we are too dumb, maybe those concepts are not generally known or
 maybe the explanation is not very well clear in the documentation.

 
 It has nothing to do with being dumb, as it is not obvious.
 
 Contracts are for verifying that your program is in a state that it is
 designed to be in. A contract failure is defined as a program bug.
 
 Errors, on the other hand, are things that can go wrong at run time, like your
 disk is full when trying to write a file. These are NOT program bugs.
 
 Another way to look at it is your program should continue to operate correctly
 if all the contracts are removed. This is not true of removing all error
 checking and handling.
 
 Furthermore, errors are something a program can recover from and continue
 operating. Contract failures are ALWAYS fatal. A common newbie (and some
 expert) misconception is that contract failures can or even must be recovered.
 This comes from a misunderstanding of the basic principles of engineering a
 safe and reliable system.

I am not so sure about this last point. Usually you want to fail, but perhaps
not always. This is about what to do after detecting a program bug vs. how to
handle an exceptional condition.

Suppose for example (actually this is from real life) there is an important
operation which, as a service, also sends an e-mail notification as part of
that operation. It is very bad if the operation fails, but a failed notification
is not that bad. What to do in case of a bug in the e-mail notification?

1. crash (gracefully), do not complete the operation.
2. log the error for the devs to look into (or crash) *after* the operation is
complete, let the operation go through without the e-mail notification.

Option 1 is annoying and prevents people from getting work done due to a
'minor' bug. Option 2, however, probably results in this bug either not getting
noticed early enough or being ignored in the face of other issues that always
seem to have higher priority. Choosing option 2 can also lead to bugs being
swallowed silently or mistaken for exceptional conditions, which is more
dangerous.

I don't mean to split hairs; I bet a lot of software has these kinds of cases.

Jun 16 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Lutger <lutger.blijdestijn gmail.com> wrote:

 Suppose for example (actually this is from real life) there is an important
 operation which, as a service, also sends an e-mail notification as part of
 that operation. It is very bad if the operation fails, but a failed
 notification is not that bad. What to do in case of a bug with the e-mail
 notification?

 1. crash (gracefully), do not complete the operation.
 2. log the error for the devs to look into (or crash) *after* the operation is
 complete, let the operation go through without the e-mail notification.

 Option 1 is annoying and prevents people from getting work done due to a
 'minor' bug. Option 2 however probably results in this bug either not getting
 noticed quite early enough or ignored in the face of other issues that always
 seems to have higher priority. Choosing for option 2 can also lead to bugs
 being swallowed silently or mistaken for exceptional conditions, which is more
 dangerous.

 I don't mean to split hairs, I bet a lot of software has these kind of cases.

How did you end up with an email system that is so horribly broken that
it spits Errors instead of Exceptions when things are not quite the way
it wants them to be?

If it cannot send the email, it may throw an Exception. If you try and
pass it a handwritten letter, it should throw an Error.

Basically, throwing an Exception means 'Your attention please, reactor 5
has a cooling problem you might want to look at', whereas an Error means
'Explosion imminent, get the fuck off outta here!'.

-- 
Simen

Jun 16 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Simen kjaeraas wrote:

 Lutger <lutger.blijdestijn gmail.com> wrote:
 
 Suppose for example (actually this is from real life) there is an important
 operation which, as a service, also sends an e-mail notification as part of
 that operation. It is very bad if the operation fails, but a failed
 notification is not that bad. What to do in case of a bug with the e-mail
 notification?

 1. crash (gracefully), do not complete the operation.
 2. log the error for the devs to look into (or crash) *after* the operation is
 complete, let the operation go through without the e-mail notification.

 Option 1 is annoying and prevents people from getting work done due to a
 'minor' bug. Option 2 however probably results in this bug either not getting
 noticed quite early enough or ignored in the face of other issues that always
 seems to have higher priority. Choosing for option 2 can also lead to bugs
 being swallowed silently or mistaken for exceptional conditions, which is more
 dangerous.

 I don't mean to split hairs, I bet a lot of software has these kind of cases.

 
 How did you end up with an email system that is so horribly broken that
 it spits Errors instead of Exceptions when things are not quite the way
 it wants them to be?

Not Errors, it is not in D and does not distinguish between Errors and 
Exceptions. It was an example, a (design) question. It's very simple:

sendEmail();
// possibly die here because some relatively unimportant thing is buggy

vs:

try
{
    sendEmail();
}
catch(BadShitThatCanHappen)
{
    RecoverFromBadShitThatCanHappen(); // ok, this is good and according to spec
}
catch(Exception ex)
{
    logError();
    // now crash? assume we know this must be the programmer's fault
}

 If it cannot send the email, it may throw an Exception. If you try and
 pass it a handwritten letter, it should throw an Error.
 
 Basically, throwing an Exception means 'Your attention please, reactor 5
 has a cooling problem you might want to look at', whereas an Error means
 'Explosion imminent, get the fuck off outta here!'.
 

No, an Error means the program has a bug. Programs have thousands of bugs, this 
is not related to how critical it is. An Exception can be way more important to 
fix than a bug. WebServerDownException for example, is often not a bug in the 
code that drives websites, but for sure I will contact the sysadmin before even 
thinking of going back to work. The question is how to proceed after the fact.

Jun 16 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Lutger <lutger.blijdestijn gmail.com> wrote:

 How did you end up with an email system that is so horribly broken that
 it spits Errors instead of Exceptions when things are not quite the way
 it wants them to be?

 Not Errors, it is not in D and does not distinguish between Errors and
 Exceptions. It was an example, a (design) question. It's very simple:

Ah. Then no, handling the exception is perfectly acceptable, and probably
the right thing to do.

 Basically, throwing an Exception means 'Your attention please, reactor 5
 has a cooling problem you might want to look at', whereas an Error means
 'Explosion imminent, get the fuck off outta here!'.

 No, an Error means the program has a bug. Programs have thousands of
 bugs, this is not related to how critical it is. An Exception can be way
 more important to fix than a bug. WebServerDownException for example, is
 often not a bug in the code that drives websites, but for sure I will
 contact the sysadmin before even thinking of going back to work. The
 question is how to proceed after the fact.

Yes and no. Throwing an error puts the program into an undefined state,
and everything may happen. Because everything happening at once would
strain space-time and cause Bad Things™ to happen, we would like to limit
the time in which this does occur. Hence, we bail out.

That said, a bug in a rarely-used function may indeed be significantly
less important than getting a server back online. However, I would still
say an error indicates something is fundamentally wrong.

-- 
Simen

Jun 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

Simen kjaeraas wrote:
 That said, a bug in a rarely-used function may indeed be significantly
 less important than getting a server back online. However, I would still
 say an error indicates something is fundamentally wrong.

The contract failing means you do not know what went wrong. That means there's 
no way the program can determine if it is recoverable or not. For all you know, 
malware may have infected your process and continuing to execute may send your 
credit card database to a thief.

Jun 17 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Simen kjaeraas wrote:

...
 If it cannot send the email, it may throw an Exception. If you try and
 pass it a handwritten letter, it should throw an Error.
 

This is the question: should I segfault on a handwritten letter even if it is 
not such an important letter and could just go on operating?

Jun 16 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Lutger <lutger.blijdestijn gmail.com> wrote:

 Simen kjaeraas wrote:

 ...
 If it cannot send the email, it may throw an Exception. If you try and
 pass it a handwritten letter, it should throw an Error.

 This is the question: should I segfault on a handwritten letter even if  
 it is
 not such an important letter and could just go on operating?

Yes. If someone is passing your email system a handwritten letter,
something is so wrong, the program should balk and exit. It's not just
a small mixup, it's an indication something is completely wrong.

As Walter put it, an Error, be it an AssertError or otherwise, means
your program has ventured into uncharted territory, and behavior from
this point on is undefined. "Permissible undefined behavior ranges
 from ignoring the situation completely with unpredictable results, to
having demons fly out of your nose."[1]


-- 
Simen

[1] http://groups.google.com/group/comp.std.c/msg/dfe1ef367547684b?pli=1

Jun 17 2010

"Jérôme M. Berger" <jeberger free.fr> writes:

Simen kjaeraas wrote:
 Lutger <lutger.blijdestijn gmail.com> wrote:

 Simen kjaeraas wrote:

 ...
 If it cannot send the email, it may throw an Exception. If you try and
 pass it a handwritten letter, it should throw an Error.

 This is the question: should I segfault on a handwritten letter even
 if it is
 not such an important letter and could just go on operating?

 Yes. If someone is passing your email system a handwritten letter,
 something is so wrong, the program should balk and exit. It's not just
 a small mixup, it's an indication something is completely wrong.


	Bad example. If someone is passing bad input to your program, it
should signal the mistake and recover. External input must *always*
be checked and wrong inputs must be recovered from gracefully.

	However, if you take (and check) the user input, then put it in a
queue, then take things from the queue for processing, and you get a
handwritten letter out of the queue, *then* it is an error and
cannot be recovered from (because this should have been checked for
before putting the letter in the queue and if this is messed up, you
don't know what else may be messed up nor how bad the situation is).
Which is what you say after:

 As Walter put it, an Error, be it an AssertError or otherwise, means
 your program has ventured into uncharted territory, and behavior from
 this point on is undefined. "Permissible undefined behavior ranges
 from ignoring the situation completely with unpredictable results, to
 having demons fly out of your nose."[1]
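
The boundary-versus-interior distinction described above can be sketched in D roughly as follows. This is only an illustration under assumptions: `Letter` and `Outbox` are made-up names, not anything from Phobos. Input arriving from outside is rejected with `enforce` (a recoverable Exception), while finding an unvalidated letter inside the queue is treated as a program bug and checked with `assert`.

```d
import std.exception : enforce;
import std.stdio : writeln;

/// Hypothetical message type for this sketch.
struct Letter
{
    string recipient;
    bool handwritten;
}

/// Hypothetical queue; every stored letter is assumed to have been validated.
struct Outbox
{
    private Letter[] pending;

    /// Boundary: the letter comes from outside, so bad input throws an
    /// Exception that the caller can catch and recover from.
    void submit(Letter letter)
    {
        enforce(!letter.handwritten, "handwritten letters are not accepted");
        pending ~= letter;
    }

    /// Interior: a handwritten letter here means validation was bypassed,
    /// which is a program bug, so it is checked with assert.
    Letter next()
    {
        assert(pending.length > 0, "next() called on an empty outbox");
        Letter letter = pending[0];
        pending = pending[1 .. $];
        assert(!letter.handwritten, "unvalidated letter found in the queue");
        return letter;
    }
}

void main()
{
    Outbox box;
    try
    {
        box.submit(Letter("bob", true)); // bad external input
    }
    catch (Exception e)
    {
        writeln("rejected: ", e.msg);    // signal the mistake and carry on
    }
    box.submit(Letter("alice", false));
    writeln(box.next().recipient);
}
```

The check in `submit` must stay in release builds, since it guards against the outside world; the asserts in `next` only document an invariant the rest of the program is supposed to maintain.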

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

Jun 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

Lutger wrote:
 Walter Bright wrote:
 Furthermore, errors are something a program can recover from and continue
 operating. Contract failures are ALWAYS fatal. A common newbie (and some
 expert) misconception is that contract failures can or even must be recovered.
 This comes from a misunderstanding of the basic principles of engineering a
 safe and reliable system.

 
 I am not so sure about this last point, usually you want to fail but perhaps
 not always. This is about what to do after detection of a program bug vs how
 to handle an exceptional condition.

First you need to decide if it is a program bug or not. If it is not a program 
bug, it shouldn't be done with contracts.

If it is a program bug, then the only proper thing to do is exit the program. 
The program cannot decide if it is a minor bug or not, nor can it decide if it 
is recoverable. It is, by definition, in an unknown state, and continuing to 
execute may cause anything to happen. (For example, malware may have installed 
itself and that may get executed.)
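
A minimal sketch of the split in D, with invented function names (`average` and `percentFromUser` are assumptions for illustration only): the `in` contract states what callers must guarantee, so a failure there is a bug in the program, whereas `enforce` checks a value that can legitimately be wrong at run time and throws an ordinary Exception.

```d
import std.exception : enforce;
import std.stdio : writeln;

/// Hypothetical example: calling this with an empty slice is a caller bug,
/// so the condition is a contract (an assert that -release compiles out).
double average(const double[] values)
in
{
    assert(values.length > 0, "average() requires a non-empty slice");
}
do
{
    double sum = 0;
    foreach (v; values)
        sum += v;
    return sum / values.length;
}

/// Hypothetical example: the value comes from a user, so an out-of-range
/// number is not a bug; enforce throws an Exception the caller may handle.
int percentFromUser(int raw)
{
    enforce(raw >= 0 && raw <= 100, "percentage must be between 0 and 100");
    return raw;
}

void main()
{
    writeln(average([1.0, 2.0, 3.0]));
    try
    {
        percentFromUser(250);
    }
    catch (Exception e)
    {
        writeln("recovered: ", e.msg);
    }
}
```

Removing the contract from `average` would not change the behavior of a correct program; removing the `enforce` from `percentFromUser` would.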

If you need notifications that the program failed, a separate monitor program 
should be used. This is how people who design safe systems do it. People who 
believe that programs can "recover" from bugs design systems that fail, 
sometimes with terrible consequences.

My articles on the topic:

http://www.drdobbs.com/blog/archives/2009/10/safe_systems_fr.html

http://www.drdobbs.com/blog/archives/2009/11/designing_safe.html

Jun 16 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Walter Bright wrote:

 Lutger wrote:
 Walter Bright wrote:
 Furthermore, errors are something a program can recover from and continue
 operating. Contract failures are ALWAYS fatal. A common newbie (and some
 expert) misconception is that contract failures can or even must be
 recovered. This comes from a misunderstanding of the basic principles of
 engineering a safe and reliable system.

 
 I am not so sure about this last point, usually you want to fail but perhaps
 not always. This is about what to do after detection of a program bug vs how
 to handle an exceptional condition.

 
 First you need to decide if it is a program bug or not. If it is not a program
 bug, it shouldn't be done with contracts.
 
 If it is a program bug, then the only proper thing to do is exit the program.
 The program cannot decide if it is a minor bug or not, nor can it decide if it
 is recoverable. It is, by definition, in an unknown state, and continuing to
 execute may cause anything to happen. (For example, malware may have installed
 itself and that may get executed.)

I didn't really get this point from your articles on the subject, but that does
clarify it for me. The assumption one makes when recovering is that it is indeed
possible and safe. Even if it may be likely, it is never reliable to count on
it.

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Lutger wrote:
 The assumption one makes when recovering is that it is indeed 
 possible and safe. Even if it may be likely, it is never reliable to count on 
 it.

Exactly.

Jun 16 2010

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 17/06/2010 00:27, Walter Bright wrote:
 Lutger wrote:
 Walter Bright wrote:
 Furthermore, errors are something a program can recover from and
 continue
 operating. Contract failures are ALWAYS fatal. A common newbie (and some
 expert) misconception is that contract failures can or even must be
 recovered.
 This comes from a misunderstanding of the basic principles of
 engineering a
 safe and reliable system.

 I am not so sure about this last point, usually you want to fail but
 perhaps not always. This is about what to do after detection of a
 program bug vs how to handle an exceptional condition.

 First you need to decide if it is a program bug or not. If it is not a
 program bug, it shouldn't be done with contracts.

I would go further and state that anything outside the direct control of 
a process (such as network state, disk state, OS state, other processes' 
behavior, user interaction, etc.) should be modeled as an error and not 
a contract violation.
Such external errors may be a "bug" in the system as a whole, but they 
are not a bug in the particular process, and thus should not be modeled 
as a contract violation.
In other words, _contract violations should always be situations that 
you can prevent by changing the code of the underlying process_. You 
can't do that for network errors, disk state, etc. But you can do that 
for stuff like ensuring a variable is never null, an object in your 
program is in some particular state at a particular point in execution, etc.

-- 
Bruno Medeiros - Software Engineer

Jun 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

Bruno Medeiros wrote:
 I would go further and state that anything outside the direct control of 
 a process (such as network state, disk state, OS state, other processes' 
 behavior, user interaction, etc.) should be modeled as an error and not 
 a contract violation.
 Such external errors may be a "bug" in the system as a whole, but they 
 are not a bug in the particular process, and thus should not be modeled 
 as a contract violation.
 In other words, _contract violations should always be situations that 
 you can prevent by changing the code of the underlying process_. You 
 can't do that for network errors, disk state, etc. But you can do that 
 for stuff like ensuring a variable is never null, an object in your 
 program is in some particular state at a particular point in execution, 
 etc.

That's a reasonable way of looking at it.

Jun 17 2010

Sean Kelly <sean invisibleduck.org> writes:

Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
 On 17/06/2010 00:27, Walter Bright wrote:
 Lutger wrote:
 Walter Bright wrote:
 Furthermore, errors are something a program can recover from and
 continue
 operating. Contract failures are ALWAYS fatal. A common newbie (and
 some
 expert) misconception is that contract failures can or even must be
 recovered.
 This comes from a misunderstanding of the basic principles of
 engineering a
 safe and reliable system.

 
 I am not so sure about this last point, usually you want to fail but
 perhaps not always. This is about what to do after detection of a
 program bug vs how to handle an exceptional condition.

 
 First you need to decide if it is a program bug or not. If it is not
 a
 program bug, it shouldn't be done with contracts.
 

 
 I would go further and state that anything outside the direct control
 of a process (such as network state, disk state, OS state, other
 processes' behavior, user interaction, etc.) should be modeled as an
 error and not a contract violation.
 Such external errors may be a "bug" in the system as a whole, but they
 are not a bug in the particular process, and thus should not be
 modeled as a contract violation.
 In other words, _contract violations should always be situations that
 you can prevent by changing the code of the underlying process_. You
 can't do that for network errors, disk state, etc. But you can do
 that for stuff like ensuring a variable is never null, an object in
 your program is in some particular state at a particular point in
 execution, etc.

Right. I'd say contracts are to catch logic errors.

Jun 18 2010

Ary Borenszweig <ary esperanto.org.ar> writes:

On 06/16/2010 11:44 PM, Walter Bright wrote:
 Ary Borenszweig wrote:
 On 06/16/2010 04:15 PM, Walter Bright wrote:
 The difference is not based on those 3 points, but on what Andrei wrote
 here. Contracts and error checking are completely distinct activities
 and should not be conflated.

 Could you please explain them? There are many people here that don't
 understand the difference between these two concepts (including me).
 So maybe we are too dumb, maybe those concepts are not generally known
 or maybe the explanation is not very well clear in the documentation.

 It has nothing to do with being dumb, as it is not obvious.

 Contracts are for verifying that your program is in a state that it is
 designed to be in. A contract failure is defined as a program bug.

 Errors, on the other hand, are things that can go wrong at run time,
 like your disk is full when trying to write a file. These are NOT
 program bugs.

 Another way to look at it is your program should continue to operate
 correctly if all the contracts are removed. This is not true of removing
 all error checking and handling.

 Furthermore, errors are something a program can recover from and
 continue operating. Contract failures are ALWAYS fatal. A common newbie
 (and some expert) misconception is that contract failures can or even
 must be recovered. This comes from a misunderstanding of the basic
 principles of engineering a safe and reliable system.

Ah, ok, now I understand. Thanks.

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 05:15:24 -0400, Walter Bright <newshound1 digitalmars.com> said:

 The difference is not based on those 3 points, but on what Andrei wrote 
 here. Contracts and error checking are completely distinct activities 
 and should not be conflated.

True.

Yet, enforce is inside std.contracts. If that isn't conflating the two 
concepts I wonder what it is. :-)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright <newshound1 digitalmars.com> 
 said:
 
 The difference is not based on those 3 points, but on what Andrei 
 wrote here. Contracts and error checking are completely distinct 
 activities and should not be conflated.

 
 True.
 
 Yet, enforce is inside std.contracts. If that isn't conflating the two 
 concepts I wonder what it is. :-)

You're right! I think Lars' suggestion is sensible - we should move 
enforce to object. Better yet we should find a better name for 
std.contracts. Ideas?

Andrei

Jun 16 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 16 Jun 2010 07:31:39 -0700, Andrei Alexandrescu wrote:

 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright
 <newshound1 digitalmars.com> said:
 
 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 
 True.
 
 Yet, enforce is inside std.contracts. If that isn't conflating the two
 concepts I wonder what it is. :-)

 
 You're right! I think Lars' suggestion is sensible - we should move
 enforce to object. Better yet we should find a better name for
 std.contracts. Ideas?
 
 Andrei


A few suggestions (even though I still think it belongs in object.d), in 
no particular order:

std.enforce
std.assumptions
std.constraints
std.checks
std.tests
std.error
std.errcheck

-Lars

Jun 17 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/17/2010 04:10 AM, Lars T. Kyllingstad wrote:
 On Wed, 16 Jun 2010 07:31:39 -0700, Andrei Alexandrescu wrote:

 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright
 <newshound1 digitalmars.com>  said:

 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the two
 concepts I wonder what it is. :-)

 You're right! I think Lars' suggestion is sensible - we should move
 enforce to object. Better yet we should find a better name for
 std.contracts. Ideas?

 Andrei


 A few suggestions (even though I still think it belongs in object.d), in
 no particular order:

 std.enforce
 std.assumptions
 std.constraints
 std.checks
 std.tests
 std.error
 std.errcheck

 -Lars

We haven't reached consensus on where to put enforce() and friends. Any 
other ideas? Of the above, I like std.checks.

Better yet, how about defining std.exception that includes a host of 
exception-related functionality (such as defining exceptions that retain 
file and line, perhaps stack traces etc.)?


Andrei

Jun 27 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Better yet, how about defining std.exception that includes a host of  
 exception-related functionality (such as defining exceptions that retain  
 file and line, perhaps stack traces etc.)?

Sounds good.

-- 
Simen

Jun 27 2010

Jonathan M Davis <jmdavisprog gmail.com> writes:

On Sunday 27 June 2010 16:09:02 Andrei Alexandrescu wrote:
 
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.
 
 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?
 
 
 Andrei

std.exception sounds like a good plan. I'm not overly fond of any of the other 
names, and I'm not sure that I care much one way or the other if we pick one, 
but std.exception with a bunch of exception-related stuff sounds particularly 
useful and could help standardize some of the way exceptions are used in D code.

- Jonathan M Davis

Jun 27 2010

Sean Kelly <sean invisibleduck.org> writes:

Andrei Alexandrescu Wrote:
 We haven't reached consensus on where to put enforce() and friends. Any 
 other ideas? Of the above, I like std.checks.
 
 Better yet, how about defining std.exception that includes a host of 
 exception-related functionality (such as defining exceptions that retain 
 file and line, perhaps stack traces etc.)?

The trace functionality already exists in druntime.  As for exceptions, they
may belong there as well if they're ones the runtime should be aware of.

Jun 28 2010

"Rory McGuire" <rmcguire neonova.co.za> writes:

On Mon, 28 Jun 2010 17:36:15 +0200, Sean Kelly <sean invisibleduck.org>  
wrote:

 Andrei Alexandrescu Wrote:
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

 The trace functionality already exists in druntime.  As for exceptions,  
 they may belong there as well if they're ones the runtime should be  
 aware of.

How does one get a print out of the stack trace then? Is it a setting or  
something?

Jun 28 2010

Sean Kelly <sean invisibleduck.org> writes:

Rory McGuire Wrote:

 On Mon, 28 Jun 2010 17:36:15 +0200, Sean Kelly <sean invisibleduck.org>  
 wrote:
 
 Andrei Alexandrescu Wrote:
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

 The trace functionality already exists in druntime.  As for exceptions,  
 they may belong there as well if they're ones the runtime should be  
 aware of.

 
 How does one get a print out of the stack trace then? Is it a setting or  
 something?

I should qualify my original statement by saying that it's only implemented for
Linux and OSX so far.  I have some of the declarations in for the Windows
implementation but haven't gotten to it yet.

Jun 28 2010

"Rory McGuire" <rmcguire neonova.co.za> writes:

On Mon, 28 Jun 2010 18:01:55 +0200, Sean Kelly <sean invisibleduck.org>  
wrote:

 Rory McGuire Wrote:

 On Mon, 28 Jun 2010 17:36:15 +0200, Sean Kelly <sean invisibleduck.org>
 wrote:

 Andrei Alexandrescu Wrote:
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

 The trace functionality already exists in druntime.  As for exceptions,
 they may belong there as well if they're ones the runtime should be
 aware of.

 How does one get a print out of the stack trace then? Is it a setting or
 something?

 I should qualify my original statement by saying that it's only  
 implemented for Linux and OSX so far.  I have some of the declarations  
 in for the Windows implementation but haven't gotten to it yet.

Is there a way to get the function name/line? I'm using this on ubuntu  
10.04.


void fun() {
	throw new Exception("eeek");
}

void main() {
	fun();
}

Output is:


object.Exception: eeek
----------------
../throw() [0x80493a0]
../throw() [0x804ba44]
../throw() [0x804b9a9]
../throw() [0x804ba81]
../throw() [0x804b9a9]
../throw() [0x804b958]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0xf764cbd6]
../throw() [0x80492b1]


-Rory

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Sean Kelly wrote:
 Rory McGuire Wrote:
 
 On Mon, 28 Jun 2010 17:36:15 +0200, Sean Kelly <sean invisibleduck.org>  
 wrote:

 Andrei Alexandrescu Wrote:
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

 The trace functionality already exists in druntime.  As for exceptions,  
 they may belong there as well if they're ones the runtime should be  
 aware of.

 How does one get a print out of the stack trace then? Is it a setting or  
 something?

 
 I should qualify my original statement by saying that it's only implemented
 for Linux and OSX so far.  I have some of the declarations in for the Windows
 implementation but haven't gotten to it yet.

My stack traces look indecipherable on Ubuntu. They only contain module 
name and memory address.

Andrei

Jun 28 2010

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 27 Jun 2010 18:09:02 -0500, Andrei Alexandrescu wrote:

 On 06/17/2010 04:10 AM, Lars T. Kyllingstad wrote:
 On Wed, 16 Jun 2010 07:31:39 -0700, Andrei Alexandrescu wrote:

 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright
 <newshound1 digitalmars.com>  said:

 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the
 two concepts I wonder what it is. :-)

 You're right! I think Lars' suggestion is sensible - we should move
 enforce to object. Better yet we should find a better name for
 std.contracts. Ideas?

 Andrei


 A few suggestions (even though I still think it belongs in object.d),
 in no particular order:

 std.enforce
 std.assumptions
 std.constraints
 std.checks
 std.tests
 std.error
 std.errcheck

 -Lars

 
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.
 
 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?


TDPL mentions several times that enforce() is in std.contracts.  Doesn't 
that preclude moving it or renaming the module?

-Lars

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lars T. Kyllingstad wrote:
 On Sun, 27 Jun 2010 18:09:02 -0500, Andrei Alexandrescu wrote:
 
 On 06/17/2010 04:10 AM, Lars T. Kyllingstad wrote:
 On Wed, 16 Jun 2010 07:31:39 -0700, Andrei Alexandrescu wrote:

 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright
 <newshound1 digitalmars.com>  said:

 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the
 two concepts I wonder what it is. :-)

 You're right! I think Lars' suggestion is sensible - we should move
 enforce to object. Better yet we should find a better name for
 std.contracts. Ideas?

 Andrei

 A few suggestions (even though I still think it belongs in object.d),
 in no particular order:

 std.enforce
 std.assumptions
 std.constraints
 std.checks
 std.tests
 std.error
 std.errcheck

 -Lars

 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

 
 
 TDPL mentions several times that enforce() is in std.contracts.  Doesn't 
 that preclude moving it or renaming the module?

I plan to move it to std.exception in a backward-compatible way (have 
std.contracts consist of only one import, then deprecate it).

Andrei

Jun 28 2010

torhu <no spam.invalid> writes:

On 28.06.2010 01:09, Andrei Alexandrescu wrote:
[...]
 We haven't reached consensus on where to put enforce() and friends. Any
 other ideas? Of the above, I like std.checks.

 Better yet, how about defining std.exception that includes a host of
 exception-related functionality (such as defining exceptions that retain
 file and line, perhaps stack traces etc.)?

How will std.exception relate to core.exception?  Seems to me having two 
modules with such similar names could easily be confusing.

Jul 05 2010

Walter Bright <newshound1 digitalmars.com> writes:

Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright <newshound1 digitalmars.com> 
 said:
 
 The difference is not based on those 3 points, but on what Andrei 
 wrote here. Contracts and error checking are completely distinct 
 activities and should not be conflated.

 
 True.
 
 Yet, enforce is inside std.contracts. If that isn't conflating the two 
 concepts I wonder what it is. :-)

I agree completely. enforce must move.

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright 
 <newshound1 digitalmars.com> said:

 The difference is not based on those 3 points, but on what Andrei 
 wrote here. Contracts and error checking are completely distinct 
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the two 
 concepts I wonder what it is. :-)

 
 I agree completely. enforce must move.

Where to?

Andrei

Jun 16 2010

Jonathan M Davis <jmdavisProg gmail.com> writes:

Andrei Alexandrescu wrote:

 Walter Bright wrote:
 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright
 <newshound1 digitalmars.com> said:

 The difference is not based on those 3 points, but on what Andrei
 wrote here. Contracts and error checking are completely distinct
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the two
 concepts I wonder what it is. :-)

 
 I agree completely. enforce must move.

 
 Where to?
 
 Andrei

I would point out that pretty much nothing in std.contracts actually relates 
to contracts. Rather, it relates to error handling. So, it would probably be 
a good idea to simply rename the module - perhaps to std.error.

- Jonathan M Davis

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 14:10:17 -0400, Jonathan M Davis <jmdavisProg gmail.com> said:

 I would point out that pretty much nothing in std.contracts actually relates
 to contracts. Rather, it relates to error handling. So, it would probably be
 a good idea to simply rename the module - perhaps to std.error.

I concur: the module is misnamed. The only things not related to error 
handling are assumeUnique and assumeSorted, and I fail to see the link 
with design by contract for either one.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 14:44:29 -0400, Michel Fortin <michel.fortin michelf.com> said:

 On 2010-06-16 14:10:17 -0400, Jonathan M Davis <jmdavisProg gmail.com> said:
 
 I would point out that pretty much nothing in std.contracts actually relates
 to contracts. Rather, it relates to error handling. So, it would probably be
 a good idea to simply rename the module - perhaps to std.error.

 
 I concur: the module is misnamed. The only things not related to error 
 handling are assumeUnique and assumeSorted, and I fail to see the link 
 with design by contract for either one.

Oh, forgot about "pointsTo" too. What's the link with contracts, or 
error handling?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Michel Fortin wrote:
 On 2010-06-16 14:44:29 -0400, Michel Fortin <michel.fortin michelf.com> 
 said:
 
 On 2010-06-16 14:10:17 -0400, Jonathan M Davis <jmdavisProg gmail.com> 
 said:

 I would point out that pretty much nothing in std.contracts actually 
 relates
 to contracts. Rather, it relates to error handling. So, it would 
 probably be
 a good idea to simply rename the module - perhaps to std.error.

 I concur: the module is misnamed. The only things not related to error 
 handling are assumeUnique and assumeSorted, and I fail to see the link 
 with design by contract for either one.

 
 Oh, forgot about "pointsTo" too. What's the link with contracts, or 
 error handling?

Certain functions (notably swap) must make sure that there's no mutual 
aliasing between two objects.
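
As a rough illustration of that kind of aliasing check: the sketch below uses `doesPointTo` from `std.exception`, which is the name current Phobos uses for this functionality as far as I know (the `pointsTo` discussed in this thread is the older spelling). `Node` and `checkedSwap` are hypothetical names made up for the example, not the library's swap.

```d
import std.exception : doesPointTo;
import std.stdio : writeln;

/// Hypothetical type with an internal pointer, for illustration only.
struct Node
{
    int value;
    Node* next;
}

/// Hypothetical swap that refuses to run when one argument holds a pointer
/// into the other. Such a call is a caller bug, hence assert, not enforce.
void checkedSwap(ref Node a, ref Node b)
{
    assert(!doesPointTo(a, b) && !doesPointTo(b, a),
           "checkedSwap: arguments alias each other");
    auto tmp = a;
    a = b;
    b = tmp;
}

void main()
{
    auto x = Node(1, null);
    auto y = Node(2, null);
    checkedSwap(x, y);           // fine: no aliasing between x and y
    writeln(x.value, " ", y.value);

    x.next = &y;                 // now x holds a pointer into y
    writeln(doesPointTo(x, y));  // the check checkedSwap would trip on
}
```

Seen this way the function is a general-purpose utility that happens to be handy inside contracts, which is the point raised about where it belongs.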

Andrei

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 14:59:45 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Michel Fortin wrote:
 On 2010-06-16 14:44:29 -0400, Michel Fortin <michel.fortin michelf.com> said:
 
 On 2010-06-16 14:10:17 -0400, Jonathan M Davis <jmdavisProg gmail.com> said:
 
 I would point out that pretty much nothing in std.contracts actually relates
 to contracts. Rather, it relates to error handling. So, it would probably be
 a good idea to simply rename the module - perhaps to std.error.

 
 I concur: the module is misnamed. The only things not related to error 
 handling are assumeUnique and assumeSorted, and I fail to see the link 
 with design by contract for either one.

 
 Oh, forgot about "pointsTo" too. What's the link with contracts, or 
 error handling?

 
 Certain functions (notably swap) must make sure that there's no mutual 
 aliasing between two objects.

Ok, so you're using "pointsTo" to check this in a contract? But isn't 
that just a utility function which can be used for contracts as much as 
for everything else? Does it really belong in std.contracts because at 
some place you use it in a contract? I don't think so. But that's 
something for you to decide. And unfortunately I'm not sure where you 
put it.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-16 20:45:47 -0400, Michel Fortin <michel.fortin michelf.com> said:

 Ok, so you're using "pointsTo" to check this in a contract? But isn't 
 that just a utility function which can be used for contracts as much as 
 for everything else? Does it really belong in std.contracts because at 
 some place you use it in a contract? I don't think so. But that's 
 something for you to decide. And unfortunately I'm not sure where you 
 put it.

Should have concluded by: "I'm not sure where you *should* put it either."

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright 
 <newshound1 digitalmars.com> said:

 The difference is not based on those 3 points, but on what Andrei 
 wrote here. Contracts and error checking are completely distinct 
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the 
 two concepts I wonder what it is. :-)

 I agree completely. enforce must move.

 
 Where to?

Dunno.

Jun 16 2010

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Walter Bright wrote:
 Michel Fortin wrote:
 On 2010-06-16 05:15:24 -0400, Walter Bright 
 <newshound1 digitalmars.com> said:

 The difference is not based on those 3 points, but on what Andrei 
 wrote here. Contracts and error checking are completely distinct 
 activities and should not be conflated.

 True.

 Yet, enforce is inside std.contracts. If that isn't conflating the 
 two concepts I wonder what it is. :-)

 I agree completely. enforce must move.

 Where to?

 
 Dunno.

import std.dunno;
Works for me.

Jun 16 2010

Walter Bright <newshound1 digitalmars.com> writes:

Don wrote:
 import std.dunno;
 Works for me.

cut & print.

Jun 16 2010

Ali Çehreli <acehreli yahoo.com> writes:

Don wrote:

 import std.dunno;
 Works for me.

Or std.poisson... :p

Ali

Jun 16 2010

biozic <dransic free.fr> writes:

On 16/06/10 22:36, Ali Çehreli wrote:
 Don wrote:

 import std.dunno;
 Works for me.

 Or std.poisson... :p

Better name it std.fishy, because std.poisson could be mistaken for a 
statistical distribution module!

Jun 16 2010

bearophile <bearophileHUGS lycos.com> writes:

Sorry for not answering earlier, I was quite busy (although in the meantime I
have written a few posts and bug reports).

Thank you to all the people who have answered in this thread, and especially
Walter, who gave the first answer in the thread that I understood.

In the beginning I didn't like enforce(), but now I can see that it's meant
for a quite different purpose. I was also worried about a possible negative
performance impact from its widespread usage in Phobos.


Regarding the exception hierarchy I think I agree with Leandro Lucarella. A
deep and complex exception hierarchy can be harmful and overkill, but the
opposite extreme, that is, having zero specialized exceptions in Phobos, is bad
too. Using just a single-exception enforce() is bad; there is a small number of
well-chosen exception types (organized in a flat or mostly flat list) that are
useful to have.

In my dlibs1 I have defined a flat list of a few exceptions that I used:

ArgumentException     
EmptyException        
ExceptionTemplate     
IndexException        
IOException           
KeyException          
MissingMethodException
OverflowException     
RangeException        
UncomparableException 


Inside Phobos2 I have counted about 160 usages of the "body" keyword. I think
contract programming can be used more often inside Phobos2 (and maybe some
usages of enforce() can be turned into contract programming because they are
more similar to program sanity checks).

Bye,
bearophile

Jun 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

Walter and I discussed this and concluded that Phobos should handle its 
parameters as user input. Therefore they need to be scrubbed with hard 
tests, not contracts.
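
A minimal sketch of that split, with hypothetical function names (`percentOf`, `scale`) that are not taken from Phobos: the public function scrubs its argument with enforce(), which stays in release builds, while the private helper relies on an in-contract that `-release` compiles out.

```d
import std.exception : assertThrown, enforce;

// Public entry point: the argument is treated like user input, so the
// check is a hard test that throws an Exception in every build.
int percentOf(int value, int percent)
{
    enforce(percent >= 0 && percent <= 100,
            "percent must be between 0 and 100 inclusive");
    return scale(value, percent);
}

// Private helper: only reachable from inside this module, so a contract
// is enough. The in-contract disappears when compiling with -release.
private int scale(int value, int percent)
in
{
    assert(percent >= 0 && percent <= 100);
}
do // this keyword was spelled `body` in 2010-era compilers
{
    return value * percent / 100;
}

void main()
{
    assert(percentOf(200, 50) == 100);
    assertThrown!Exception(percentOf(200, 150));
}
```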

Andrei

Jun 19 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Andrei Alexandrescu wrote:

 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 
 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.
 
 Andrei

That is sensible. Are private functions (those only called from within Phobos) 
excluded from this rule?

Jun 19 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/19/2010 05:01 PM, Lutger wrote:
 Andrei Alexandrescu wrote:

 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.

 Andrei

 That is sensible. Are private functions (those only called from within Phobos)
 excluded from this rule?

Yes, precisely. (The actual code does not fully obey this intention; 
there are contracts in places where there shouldn't be.)

Andrei

Jun 19 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 
 Walter and I discussed this and concluded that Phobos should handle its 
 parameters as user input. Therefore they need to be scrubbed with hard 
 tests, not contracts.

I should add that any library that may be used as a dll should have its 
interface API checked with hard tests, not contracts. This is because a dll 
cannot control who connects to it, and therefore must regard anything sent to 
it as unvalidated user input.

Jun 19 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sun, 20 Jun 2010 03:04:31 +0300, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

  Walter and I discussed this and concluded that Phobos should handle  
 its parameters as user input. Therefore they need to be scrubbed with  
 hard tests, not contracts.

 I should add that any library that may be used as a dll should have its  
 interface API checked with hard tests, not contracts. This is because a  
 dll cannot control who connects to it, and therefore must regard  
 anything sent to it as unvalidated user input.

I don't see the logic in this...

Are we talking about validating user input for the sake of security, or  
debugging (catching bugs early)?

If it's for the sake of debugging, debug checks should remain in debug  
builds (that's what contracts are for?). Otherwise, you are stripping the  
programmer of the choice between higher performance or more debug checks.

If it's for the sake of security - parameter validation in DLLs is  
pointless. If you are able to load and call code from inside a DLL, you  
are already able to do everything that the DLL can. DLLs don't have any  
"setuid"-like properties. If we were talking, for example, about syscalls  
for a kernel module (functions called from userland but executed in kernel  
land), then that would be a completely different situation.

Also, I don't think that one rule can apply for everyone. For example, a  
high-performance DLL may specify in its documentation that the function  
parameters are not checked by the DLL and must be valid, otherwise  
undefined behavior will occur. (I believe some Windows APIs do not check  
some parameters and will cause access violations when called with invalid  
parameters.)

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 19 2010

Walter Bright <newshound2 digitalmars.com> writes:

Vladimir Panteleev wrote:
 On Sun, 20 Jun 2010 03:04:31 +0300, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

  Walter and I discussed this and concluded that Phobos should handle 
 its parameters as user input. Therefore they need to be scrubbed with 
 hard tests, not contracts.

 I should add that any library that may be used as a dll should have 
 its interface API checked with hard tests, not contracts. This is 
 because a dll cannot control who connects to it, and therefore must 
 regard anything sent to it as unvalidated user input.

 
 I don't see the logic in this...
 
 Are we talking about validating user input for the sake of security, or 
 debugging (catching bugs early)?

An input to a dll is user input, and should be validated (for the sake of 
security, and other reasons). Validating it is not debugging.


 If it's for the sake of debugging, debug checks should remain in debug 
 builds (that's what contracts are for?). Otherwise, you are stripping 
 the programmer of the choice between higher performance or more debug 
 checks.
 
 If it's for the sake of security - parameter validation in DLLs is 
 pointless. If you are able to load and call code from inside a DLL, you 
 are already able to do everything that the DLL can. DLLs don't have any 
 "setuid"-like properties. If we were talking, for example, about 
 syscalls for a kernel module (functions called from userland but 
 executed in kernel land), then that would be a completely different 
 situation.

If you, for example, provide a pluggable interface to your browser app, that's 
done using a dll, and you'd better validate anything you get through that 
plugin interface!

 Also, I don't think that one rule can apply for everyone. For example, a 
 high-performance DLL may specify in its documentation that the function 
 parameters are not checked by the DLL and must be valid, otherwise 
 undefined behavior will occur. (I believe some Windows APIs do not check 
 some parameters and will cause access violations when called with 
 invalid parameters.)

If you don't validate the input, then you must accept the risk.

Jun 20 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright  
<newshound2 digitalmars.com> wrote:

 An input to a dll is user input, and should be validated (for the sake  
 of security, and other reasons). Validating it is not debugging.

I don't understand why you're saying this. Security checks in DLL  
functions are pointless, for the reasons I already outlined:

  If it's for the sake of security - parameter validation in DLLs is  
 pointless. If you are able to load and call code from inside a DLL, you  
 are already able to do everything that the DLL can. DLLs don't have any  
 "setuid"-like properties. If we were talking, for example, about  
 syscalls for a kernel module (functions called from userland but  
 executed in kernel land), then that would be a completely different  
 situation.

 If you, for example, provide a pluggable interface to your browser app,  
 that's done using a dll, and you'd better validate anything you get  
 through that plugin interface!

Why? When your application loads a DLL, the DLL instantly gets access to  
all of your application's memory, handles, and other resources. It's  
running in the same address space and security context. You need to  
completely trust the DLL - which is why new browsers (Google Chrome and  
experimental Firefox versions) load plugins in separate processes with  
reduced privileges.

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 20 2010

BCS <none anon.com> writes:

Hello Vladimir,

 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright
 <newshound2 digitalmars.com> wrote:
 
 An input to a dll is user input, and should be validated (for the
 sake  of security, and other reasons). Validating it is not
 debugging.
 

 I don't understand why you're saying this. Security checks in DLL
 functions are pointless, for the reasons I already outlined:
 

import my.dll;

void fn()
{
    auto data = get.userUncheckedInput();
    my.dll.doSomething(data); // if doSomething doesn't check its inputs,
                              // then this can cause a security flaw
}

Yes, that's your dll's user's fault, but adding the checks solves it even so. 
To boot, it reduces your support cost (as long as people read error messages) 
and prevents the user from having to debug starting deep inside your dll.

 If it's for the sake of security - parameter validation in DLLs is
 pointless. If you are able to load and call code from inside a DLL,
 you  are already able to do everything that the DLL can. DLLs don't
 have any  "setuid"-like properties. If we were talking, for example,
 about  syscalls for a kernel module (functions called from userland
 but  executed in kernel land), then that would be a completely
 different  situation.
 

 If you, for example, provide a pluggable interface to your browser
 app,  that's done using a dll, and you'd better validate anything you
 get  through that plugin interface!
 

 Why? When your application loads a DLL, the DLL instantly gets access
 to  all of your application's memory, handles, and other resources.
 It's  running in the same address space and security context. You need
 to  completely trust the DLL - which is why new browsers (Google
 Chrome and  experimental Firefox versions) load plugins in separate
 processes with  reduced privileges.

And you can bet that every byte of data shipped back and forth via IPC is 
validated more than an air traveler at a TSA checkpoint.

As for the case where the dll is local, never attribute to malice that which 
can be adequately explained by stupidity. Unless you have source, you can't 
assume that the data coming out doesn't contain unvalidated user input and 
you should always assume that someone malicious will get hold of that sooner 
or later.
 
-- 
... <IXOYE><

Jun 20 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 03:02:53 +0300, BCS <none anon.com> wrote:

 import my.dll;

 void fn()
 {
     auto data = get.userUncheckedInput();
      my.dll.doSomething(data); // if doSomething doesn't check its
                                // inputs, then this can cause a security flaw
 }

 Yes that's your dll's user's fault but adding the checks solves it even  
 so. To boot, it reduces your support cost (as long as people read error  
 message) and prevents the user from having to debug starting deep inside  
 your dll.

A well-designed application needs to validate unsafe user input exactly  
once (assuming the process of validation is the same).
DLL interfaces must specify whether the input/output can be considered  
safe or not.
Not doing so results in either security holes or redundant code.
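
One way to sketch "validate exactly once" is a wrapper type that can only be obtained through the validating function, so the interface itself states whether input is considered safe. All names here (`ValidatedName`, `validateName`, `storeName`) and the length rule are hypothetical, chosen only for illustration.

```d
import std.exception : assertThrown, enforce;

// Hypothetical wrapper: holding one means validation has already run.
// (D's `private` is per module, so this only guards other modules.)
struct ValidatedName
{
    private string value;
    string get() const { return value; }
}

// The single place where raw input is checked; the rule is illustrative.
ValidatedName validateName(string raw)
{
    enforce(raw.length > 0 && raw.length <= 64,
            "name must be 1 to 64 characters long");
    return ValidatedName(raw);
}

// Documented as taking already-validated input, so it does not re-check.
size_t storeName(ValidatedName name)
{
    return name.get.length;
}

void main()
{
    auto name = validateName("walter");
    assert(storeName(name) == 6);
    assertThrown!Exception(validateName(""));
}
```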

 And you can bet that every byte of data shipped back and forth via IPC  
 is validated more than an air traveler at a TSA checkpoint.

Obviously, no argument here.

 As for the case where the dll is local, never attribute to malice that  
 which can be adequately explained by stupidity. Unless you have source,  
 you can't assume that the data coming out doesn't contain unvalidated  
 user input and you should always assume that someone malicious will get  
 hold of that sooner or later.

Unless you have the source, you can't assume that simply loading the DLL  
will not create a security hole.

Trusting the DLL but not trusting the data it gives you is a plausible  
case. As I said before, this simply needs to be well-defined, and  
validation doesn't have to happen exactly at the DLL boundary.


-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 20 2010

BCS <none anon.com> writes:

Hello Vladimir,

 On Mon, 21 Jun 2010 03:02:53 +0300, BCS <none anon.com> wrote:
 
 import my.dll;
 
 void fn()
 {
 auto data = get.userUncheckedInput();
 my.dll.doSomething(data); // if doSomething doesn't check its
                           // inputs, then this can cause a security flaw
 }
 Yes that's your dll's user's fault but adding the checks solves it
 even  so. To boot, it reduces your support cost (as long as people
 read error  message) and prevents the user from having to debug
 starting deep inside  your dll.
 

 A well-designed application needs to validate unsafe user input
 exactly
 once (assuming the process of validation is the same).
 DLL interfaces must specify whether the input/output can be considered
 safe or not.
 Not doing so results in either security holes or redundant code.

If I didn't write the DLL I'm calling, I'll assume it doesn't check stuff. 
If I didn't write the code calling my DLL, I'll assume it doesn't check stuff. 
Why should I assume that the documentation is right or that people will even 
read my documentation? Unless you can show me that this causes a perf problem, 
the benefit just isn't worth the cost.

 And you can bet that every byte of data shipped back and forth via
 IPC  is validated more than an air traveler at a TSA checkpoint.
 

 Obviously, no argument here.
 
 As for the case where the dll is local, never attribute to malice
 that  which can be adequately explained by stupidity. Unless you have
 source,  you can't assume that the data coming out doesn't contain
 unvalidated  user input and you should always assume that someone
 malicious will get  hold of that sooner or later.
 

 Unless you have the source, you can't assume that simply loading the
 DLL  will not create a security hole.

OK, but just because there might be a risk in loading a DLL is no reason 
to not address other risks that you can deal with.

 
 Trusting the DLL but not trusting the data it gives you is a plausible
 case. As I said before, this simply needs to be well-defined, and
 validation doesn't have to happen exactly at the DLL boundary.

Good point, I might not check it exactly at the call site, but *I* will 
check it because I will assume that any checks on the other side of a DLL 
interface are flawed, missing, broken or just flat wrong.

-- 
... <IXOYE><

Jun 20 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 04:53:46 +0300, BCS <none anon.com> wrote:

 If I didn't write the DLL I'm calling, I'll assume it doesn't check  
 stuff. If I didn't write the code calling my DLL, I'll assume it doesn't  
 check stuff. Why should I assume that the documentation is right or that  
 people will even read my documentation? Unless you can show me that this  
 causes a perf problem, the benefit just isn't worth the cost.

If you can't trust the DLL to perform correct user data validation, you  
can't trust it AT ALL! For all you know it can have a buffer overflow  
vulnerability. Re-validating any data you get from it may save you from  
one type of bug, but it doesn't improve security by much overall.

Regarding performance: what is not a "performance problem" in any one  
single place can make a considerable difference when you sum up all the  
redundant checks in your entire codebase.

A practical example from the industry:

For Microsoft partners, Windows is available in a "free" (or retail) build  
and a "checked" build [1].
Since driver code runs in kernel space, drivers can crash the entire  
system anyway - for this reason, there are few checks in kernel mode APIs.  
(That's why sometimes when debugging BSoDs, crashes will happen in a  
completely unrelated kernel module.)
However, if you need to debug your driver, you run it on the checked  
version of Windows, which in addition to having lots of debug checks is  
also built without most compiler optimizations.

   [1]: http://msdn.microsoft.com/en-us/library/ff543450(VS.85).aspx

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 21 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/20/2010 06:18 PM, Vladimir Panteleev wrote:
 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright
 <newshound2 digitalmars.com> wrote:

 An input to a dll is user input, and should be validated (for the sake
 of security, and other reasons). Validating it is not debugging.

 I don't understand why you're saying this. Security checks in DLL
 functions are pointless, for the reasons I already outlined:

[snip]

I think the matter is simpler than that. Essentially DbC is modular 
integrity checking. If Phobos enforce()s parameters in calls, then it 
considers its own integrity a different matter than the integrity of the 
application it's used with.

If Phobos used contracts to validate parameters, it would directly share 
responsibility for the integrity of the entire application. That way, 
users will not be sure whether the failure is a bug in Phobos or one in 
their own code.


Andrei

Jun 20 2010

Walter Bright <newshound2 digitalmars.com> writes:

Vladimir Panteleev wrote:
 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright 
 <newshound2 digitalmars.com> wrote:
 
 An input to a dll is user input, and should be validated (for the sake 
 of security, and other reasons). Validating it is not debugging.

 
 I don't understand why you're saying this. Security checks in DLL 
 functions are pointless, for the reasons I already outlined:

It's true that whenever user code is executed, that code can do anything. 
Hello, ActiveX. But I still think it's sound practice to treat any data 
received from another program as untrusted, and validate it. Security, like I 
said, is only one reason. Another is to prevent bugs in external code from 
trashing your process.

Jun 20 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 03:40:48 +0300, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Vladimir Panteleev wrote:
 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright  
 <newshound2 digitalmars.com> wrote:

 An input to a dll is user input, and should be validated (for the sake  
 of security, and other reasons). Validating it is not debugging.

  I don't understand why you're saying this. Security checks in DLL  
 functions are pointless, for the reasons I already outlined:

 It's true that whenever user code is executed, that code can do  
 anything. Hello, ActiveX. But I still think it's sound practice to treat  
 any data received from another program as untrusted, and validate it.  
 Security, like I said, is only one reason. Another is to prevent bugs in  
 external code from trashing your process.

Yes, but this is a completely different kind of trust (incompetence  
instead of intentional malice) :)

I was simply arguing the technical point of pointlessness of verifying  
data from DLLs specifically for security reasons (buffer overflows, code  
injection etc.).

Other than that, this is the usual performance vs. robustness dilemma  
(though my personal opinion is that an ideal language/platform/etc. should  
allow programmers to take all the responsibility for maximum performance).

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 20 2010

Leandro Lucarella <luca llucax.com.ar> writes:

Walter Bright, on June 20 at 17:40, you wrote me:
 Vladimir Panteleev wrote:
On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright
<newshound2 digitalmars.com> wrote:

An input to a dll is user input, and should be validated (for
the sake of security, and other reasons). Validating it is not
debugging.

I don't understand why you're saying this. Security checks in DLL
functions are pointless, for the reasons I already outlined:

 
 It's true that whenever user code is executed, that code can do
 anything. Hello, ActiveX. But I still think it's sound practice to
 treat any data received from another program as untrusted, and
 validate it. Security, like I said, is only one reason. Another is
 to prevent bugs in external code from trashing your process.

How can you prevent that? If you pass incorrect data to a DLL, then the
bug is *yours*. If the DLL has a bug, it will explode anyways. You are
just trying to catch program bugs in the DLL, which seems overly
patronizing to me. Why will you assume I'm so dumb that I won't use your
interface correctly?

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Karma police
arrest this girl,
her Hitler hairdo
is making me feel ill
and we have crashed her party.

Jun 20 2010

Walter Bright <newshound2 digitalmars.com> writes:

Leandro Lucarella wrote:
 Why will you assume I'm so dumb that I won't use your
 interface correctly?

Windows has had major legacy compatibility issues because critical third party 
applications misused the APIs.

People *will* misuse your API, and you will get blamed for it. It's unfair, but 
that's how it works.

Jun 20 2010

Leandro Lucarella <luca llucax.com.ar> writes:

Walter Bright, on June 20 at 19:32, you wrote me:
 Leandro Lucarella wrote:
Why will you assume I'm so dumb that I won't use your
interface correctly?

 
 Windows has had major legacy compatibility issues because critical
 third party applications misused the APIs.
 
 People *will* misuse your API, and you will get blamed for it. It's
 unfair, but that's how it works.

Luckily I haven't used Windows for about 10 years now =)

It's really a shame that D will take the stupidity route.

PS: I don't know how Windows works, but if calling the Windows API is
    like going into kernel mode, and you can mess with other processes, it
    seems reasonable to check every API call as if it were user
    input, but if you're confined to your process, it is really stupid.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Vaporeso staunchly upheld the No-Water theory, which was his own and
stated the following: "To turn the other cheek to fire, it must be
put out with barely damp espadrilles".

Jun 20 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/20/2010 11:08 PM, Leandro Lucarella wrote:
 Walter Bright, el 20 de junio a las 19:32 me escribiste:
 Leandro Lucarella wrote:
 Why will you assume I'm so dumb that I won't use your
 interface correctly?

 Windows has had major legacy compatibility issues because critical
 third party applications misused the APIs.

 People *will* misuse your API, and you will get blamed for it. It's
 unfair, but that's how it works.

 Luckily I haven't used Windows for about 10 years now =)

 It's really a shame that D will take the stupidity route.

 PS: I don't know how windows work, but if calling the Windows API is
      like going into kernel mode, and you can mess other processes, it
      seems reasonable to do check every API call as if it were user
      input, but if you're confined to your process, is really stupid.

Why is it stupid?

Andrei

Jun 21 2010

Leandro Lucarella <luca llucax.com.ar> writes:

Andrei Alexandrescu, on June 21 at 08:02, you wrote me:
 On 06/20/2010 11:08 PM, Leandro Lucarella wrote:
Walter Bright, el 20 de junio a las 19:32 me escribiste:
Leandro Lucarella wrote:
Why will you assume I'm so dumb that I won't use your
interface correctly?

Windows has had major legacy compatibility issues because critical
third party applications misused the APIs.

People *will* misuse your API, and you will get blamed for it. It's
unfair, but that's how it works.

Luckily I haven't used Windows for about 10 years now =)

It's really a shame that D will take the stupidity route.

PS: I don't know how windows work, but if calling the Windows API is
     like going into kernel mode, and you can mess other processes, it
     seems reasonable to do check every API call as if it were user
     input, but if you're confined to your process, is really stupid.

 
 Why is it stupid?

Because you're adding unnecessary extra checks, just based on
(Windows?) programmer's stupidity.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
COMPULSIVE GAMBLING IS HARMFUL TO YOUR HEALTH.
	-- Casino de Mar del Plata

Jun 21 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 16:30:48 +0300, Leandro Lucarella <luca llucax.com.ar>  
wrote:

 Andrei Alexandrescu, el 21 de junio a las 08:02 me escribiste:
 On 06/20/2010 11:08 PM, Leandro Lucarella wrote:
Walter Bright, el 20 de junio a las 19:32 me escribiste:
Leandro Lucarella wrote:
Why will you assume I'm so dumb that I won't use your
interface correctly?

Windows has had major legacy compatibility issues because critical
third party applications misused the APIs.

People *will* misuse your API, and you will get blamed for it. It's
unfair, but that's how it works.

Luckily I haven't used Windows for about 10 years now =)

It's really a shame that D will take the stupidity route.

PS: I don't know how windows work, but if calling the Windows API is
     like going into kernel mode, and you can mess other processes, it
     seems reasonable to do check every API call as if it were user
     input, but if you're confined to your process, is really stupid.

 Why is it stupid?

 Because you're adding unnecessary extra checks, just based on
 (Windows?) programmer's stupidity.

Walter makes a good point. If someone uses your API in the wrong way and  
relies on undocumented/undefined behavior, you'll end up having to support  
this usage pattern in future implementations of your interface if you want  
businesses and other entities who depend on that product to buy your new  
operating system.

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 21 2010

Sean Kelly <sean invisibleduck.org> writes:

Andrei Alexandrescu Wrote:

 On 06/20/2010 06:18 PM, Vladimir Panteleev wrote:
 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright
 <newshound2 digitalmars.com> wrote:

 An input to a dll is user input, and should be validated (for the sake
 of security, and other reasons). Validating it is not debugging.

 I don't understand why you're saying this. Security checks in DLL
 functions are pointless, for the reasons I already outlined:

 [snip]
 
 I think the matter is simpler than that. Essentially DbC is modular 
 integrity checking. If Phobos enforce()s parameters in calls, then it 
 considers its own integrity a different matter than the integrity of the 
 application it's used with.
 
 If Phobos used contracts to validate parameters, it would directly share 
 responsibility for the integrity of the entire application. That way, 
 users will not be sure whether the failure is a bug in Phobos or one in 
 their own code.

If an unrecoverable failure occurs within the process, does it matter where the
error originated?

I've been thinking about this a bit and am starting to wonder about the benefit
of distinguishing API boundary integrity checking vs. internal integrity
checking.  First, what if a library eats its own dogfood?  If my library
provides a public method to spawn threads and the library itself uses threads
internally then I have two different methods of checking the integrity of my
own library, each possibly throwing different exceptions (enforce throws
Exception while assert throws AssertError).  At the very least, this seems like
it could cause maintenance issues because a logic change deep within the
library may require additional exception handling to deal with what are
intended to be user-facing errors.
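
To make the two failure modes concrete, here is a minimal sketch. It assumes a
present-day Phobos where enforce lives in std.exception (at the time of this
thread it was std.contracts.enforce), and parsePort and halve are hypothetical
names used only for illustration, not anything from Phobos. It also assumes a
build without -release, since assert is compiled out there.

```d
import std.exception : enforce;
import core.exception : AssertError;
import std.stdio : writeln;

// Hypothetical public entry point: caller input is validated with enforce,
// which throws Exception and stays active in release builds.
int parsePort(int port)
{
    enforce(port > 0 && port < 65536, "port out of range");
    return port;
}

// Hypothetical internal helper: the library's own logic is checked with
// assert, which throws AssertError and is removed by -release.
int halve(int evenValue)
{
    assert(evenValue % 2 == 0, "internal logic error: odd value");
    return evenValue / 2;
}

void main()
{
    try
    {
        parsePort(-1);
    }
    catch (Exception e)
    {
        writeln("caller error: ", e.msg);
    }

    try
    {
        halve(3);
    }
    catch (AssertError e)
    {
        writeln("library bug: ", e.msg);
    }
}
```

A caller that wants to handle both has to know about two unrelated branches of
the Throwable hierarchy, which is the maintenance concern raised above.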

In a similar vein, if contracts are used within an API but a different mode of
checking is used at the API boundary, when contracts are enabled in that API
the user is faced with the bizarre issue of receiving AssertErrors from
internal API logic errors but only Exceptions from his own logic errors for API
boundary calls.  When you say that DbC is modular I'd presume that means it's
encapsulated within each distinct subsystem, but it seems like this isn't true
at all.  The alternative is to turn DbC off in the library and either live with
undefined behavior or a hard crash if there's a bug in the library, a
circumstance which is again forced upon the user.

Regarding DbC, I can't say that I've ever worked on a system where lives hung
in the balance (an admittedly extreme example of where DbC is useful), but if I
were sufficiently concerned about process integrity that I had contracts
enabled then I don't think I would trust that a third-party library was
bug-free and therefore didn't need its own contract checking in place.  Once
I've accepted the cost of integrity checking I want it everywhere, not just in
my own code.  It makes for consistent error checking behavior (I'd assume there
is a system in place to trap DbC errors specifically) and provides full-process
integrity checking.

I think the only boundary that really matters is the process boundary.  Any
error within a process has the same effect regardless of whether it's in user
code or library code--corrupted memory, segfaults, etc--so why make a
distinction between code I wrote and code someone else wrote?  By the same
token, if the user chooses to disable contracts then why force them upon him
for some errors but not others?  The user is making the explicit choice to run
his process through a meat-grinder if something unexpected happens, he's
effectively asserting that his code is perfect, so why tell him that "no, it's
actually not."

For me, the more difficult issue is how much integrity checking should be done.
 For example, I created an AVL tree a while back that verified that the tree
was still properly balanced after every API call.  This was great from a code
verification standpoint, but the check completely violated the complexity
guarantees and ultimately checked something that could have been proven to a
reasonable degree of confidence through code reviews and unit testing.  Should
checks like this be enabled automatically when DbC is turned on?  Should there
be different levels of DbC?  In some respects I feel like there's a difference
between in/out contracts and invariants, but even that doesn't seem completely
right.  Thoughts?

Jun 21 2010

bearophile <bearophileHUGS lycos.com> writes:

Sean Kelly:

First, what if a library eats its own dogfood? If my library provides a public
method to spawn threads and the library itself uses threads internally then I
have two different methods of checking the integrity of my own library, each
possibly throwing different exceptions (enforce throws Exception while assert
throws AssertError).<

I think that Design by contract, to be useful, needs to be embraced. You need
to trust it and use it everywhere. Now I use DbC quite often in my D code and I
appreciate it. The DbC feature I miss mostly is the "old" (view of the original
input data).

For example, I created an AVL tree a while back that verified that the tree was
still properly balanced after every API call. This was great from a code
verification standpoint, but the check completely violated the complexity
guarantees and ultimately checked something that could have been proven to a
reasonable degree of confidence through code reviews and unit testing. Should
checks like this be enabled automatically when DbC is turned on? Should there
be different levels of DbC? In some respects I feel like there's a difference
between in/out contracts and invariants, but even that doesn't seem completely
right. Thoughts?<

Using Design by Contract is not easy; you need to train yourself to use it
well. One problem is that DbC is uncommon: only Eiffel and a few other languages
use it seriously, so a lot of D users have to learn DbC in D itself.

I deal with your problem by putting inside the contracts only code that doesn't
change the complexity of the code it guards (so for example if the code is
O(n^2) I don't add contracts that perform O(n^3) computations).

Then where it's useful I add stronger tests (that can be slower) inside
debug{}. You can see it here too:
http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=109395&header

Inside the invariant there is O(1) test code, and inside a debug{} block it
also contains O(n) test code.

In debug mode I want to test the code very well, while in normal nonrelease
mode I can accept less stringent tests that make the code usable.
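
A minimal sketch of that layering, under stated assumptions: SortedInts is a
hypothetical container invented for this example (it is not the code behind the
link above), and the contract body uses the present-day `do` keyword.

```d
import std.algorithm : isSorted;

// Hypothetical container that keeps its elements in ascending order.
struct SortedInts
{
    private int[] items;

    invariant()
    {
        // O(1): cheap enough to run around every public call.
        assert(items.length < 2 || items[0] <= items[$ - 1]);

        // O(n): compiled in only when building with -debug.
        debug assert(isSorted(items));
    }

    // Appending is O(1) amortized, so the precondition is O(1) as well.
    void append(int x)
    in
    {
        assert(items.length == 0 || items[$ - 1] <= x);
    }
    do
    {
        items ~= x;
    }

    size_t length() const
    {
        return items.length;
    }
}

void main()
{
    SortedInts s;
    s.append(1);
    s.append(2);
    s.append(5);
    assert(s.length == 3);
}
```

Built normally, only the constant-time checks run; adding -debug switches on
the full isSorted scan without touching the source.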

Unit tests and DbC (and integration tests, functional tests, smoke tests, etc)
are all useful; they do different things :-) For example a unit test can tell
me a function is wrong, but a loop invariant can tell me where the bug is and
when it happens inside the function :-)

Bye,
bearophile

Jun 21 2010

Walter Bright <newshound2 digitalmars.com> writes:

Sean Kelly wrote:
 Regarding DbC, I can't say that I've ever worked on a system where lives hung
 in the balance (an admittedly extreme example of where DbC is useful),

I have, and here's how it's done:

http://www.drdobbs.com/blog/archives/2009/10/safe_systems_fr.html

http://www.drdobbs.com/blog/archives/2009/11/designing_safe.html

I really wish this was more widely known in the software engineering business. 
It's frustrating to see it relearned the hard way, over and over.

And not just the software business, I saw a technical overview of the BP oil 
spill failure, and the rig design violated just about every principle of safe 
system design.

Jun 22 2010

Sean Kelly <sean invisibleduck.org> writes:

Walter Bright Wrote:

 Sean Kelly wrote:
 Regarding DbC, I can't say that I've ever worked on a system where lives hung
 in the balance (an admittedly extreme example of where DbC is useful),

 
 I have, and here's how it's done:
 
 http://www.drdobbs.com/blog/archives/2009/10/safe_systems_fr.html
 
 http://www.drdobbs.com/blog/archives/2009/11/designing_safe.html
 
 I really wish this was more widely known in the software engineering business. 
 It's frustrating to see it relearned the hard way, over and over.
 
 And not just the software business, I saw a technical overview of the BP oil 
 spill failure, and the rig design violated just about every principle of safe 
 system design.

A coworker of mine knows a guy who had workers on that rig and told me this
story the other day.  Apparently, there's a system on the drill that when a
failure occurs a cap drops over the hole and shears the drill.  The BP rig was
drilling unusually deep though, and as a result the drill had to be incredibly
hard.  For this and other reasons, the safety system was estimated to have a
70% failure rate.  Furthermore, the rig was known to be on the verge of
failure.  He implored the BP folks to shut it down, but they refused so in
desperation he hired people to fly his team off the rig, fearing for their
safety.  The rig failed a few hours after his team was evacuated.

While I've never worked on systems where lives hang in the balance, I have
worked on systems where 100% uptime is required.  I favor the Erlang approach
where a system is a web of interconnected, redundant processes that terminate
on errors.  I've found this design an extremely hard sell in the internet
server world though.  The design takes more planning and people are in too much
of a hurry.

Jun 22 2010

Sean Kelly <sean invisibleduck.org> writes:

Sean Kelly Wrote:
 
 While I've never worked on systems where lives hang in the balance, I have
worked on systems where 100% uptime is required.  I favor the Erlang approach
where a system is a web of interconnected, redundant processes that terminate
on errors.  I've found this design an extremely hard sell in the internet
server world though.  The design takes more planning and people are in too much
of a hurry.

I should add that I'm hoping the message passing model in D will help encourage
reliable system design.  With thread isolation it should be pretty easy to move
parts of a program into separate processes as need dictates.
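
For readers who have not seen the API, a minimal sketch of the in-process
message passing being discussed. It assumes present-day std.concurrency
(spawn, send, receive, receiveOnly, ownerTid); the worker function is a
hypothetical example, and nothing here does cross-process IPC.

```d
import std.concurrency;

// Hypothetical worker: waits for one int and sends back its double.
// It shares no mutable state with the owner thread.
void worker()
{
    receive((int x) { ownerTid.send(x * 2); });
}

void main()
{
    auto tid = spawn(&worker);   // start an isolated thread
    tid.send(21);                // message in
    auto result = receiveOnly!int();   // message out
    assert(result == 42);
}
```

Because the worker only communicates through messages, the same shape could in
principle be backed by a separate process, which is the hope expressed above.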

Jun 22 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Sean Kelly wrote:

 Sean Kelly Wrote:
 
 While I've never worked on systems where lives hang in the balance, I have
 worked on systems where 100% uptime is required.  I favor the Erlang approach
 where a system is a web of interconnected, redundant processes that terminate
 on errors.  I've found this design an extremely hard sell in the internet
 server world though.  The design takes more planning and people are in too
 much of a hurry.

 
 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design.  With thread isolation it should be pretty
 easy to move parts of a program into separate processes as need dictates.

Can we look forward to seeing IPC supported in Phobos via the same interface in 
the future? I really like the API you have created.

Jun 23 2010

Sean Kelly <sean invisibleduck.org> writes:

Lutger <lutger.blijdestijn gmail.com> wrote:
 Sean Kelly wrote:
 
 Sean Kelly Wrote:
 
 While I've never worked on systems where lives hang in the balance,
 I have
 worked on systems where 100% uptime is required.  I favor the Erlang
 approach
 where a system is a web of interconnected, redundant processes that
 terminate
 on errors.  I've found this design an extremely hard sell in the
 internet
 server world though.  The design takes more planning and people are
 in too
 much of a hurry.

 
 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design.  With thread isolation it should be
 pretty
 easy to move parts of a program into separate processes as need
 dictates.

 
 Can we look forward to seeing ipc supported in phobos via the same
 interface in 
 the future? I really like the api you have created.

Yes. The core send/receive API should work just fine for IPC, and it's
definitely on the map.  The greatest obstacle there is probably the need
for a solid serialization/deserialization package in Phobos.

Jun 24 2010

Jacob Carlborg <doob me.com> writes:

On 2010-06-25 05:17, Sean Kelly wrote:
 Lutger<lutger.blijdestijn gmail.com>  wrote:
 Sean Kelly wrote:

 Sean Kelly Wrote:
 While I've never worked on systems where lives hang in the balance,
 I have
 worked on systems where 100% uptime is required.  I favor the Erlang
 approach
 where a system is a web of interconnected, redundant processes that
 terminate
 on errors.  I've found this design an extremely hard sell in the
 internet
 server world though.  The design takes more planning and people are
 in too
 much of a hurry.

 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design.  With thread isolation it should be
 pretty
 easy to move parts of a program into separate processes as need
 dictates.

 Can we look forward to seeing ipc supported in phobos via the same
 interface in
 the future? I really like the api you have created.

 Yes. The core send/receive API should work just fine for IPC, and it's
 definitely on the map.  The greatest obstacle there is probably the need
 for a solid serialization/deserialization package in Phobos.

I have a serialization library, http://dsource.org/projects/orange/ , 
this is a list of some of its features:

* It handles both serializing and deserializing
* It automatically serializes the base classes
* It supports events (before and after (de)serializing)
* It supports non-serialized fields (you can say that some fields in a 
class should not be serialized)
* It's licensed under the Boost license
* It's fairly std/runtime library independent
* You can create new archive types and use them with the existing serializer
* Serializes through base class references
* Serializes third party types

Currently it only works using Tango but the only part of the library 
that is dependent on the std/runtime library is XMLArchive, I'm 
currently working on porting it to Phobos. It also needs testing.

Also issue 2844 and the one, can't find it now, about getMembers is not 
implemented at all (returns an empty array).

-- 
/Jacob Carlborg

Jun 25 2010

Jacob Carlborg <doob me.com> writes:

On 2010-06-25 14:54, Jacob Carlborg wrote:
 On 2010-06-25 05:17, Sean Kelly wrote:
 Lutger<lutger.blijdestijn gmail.com> wrote:
 Sean Kelly wrote:

 Sean Kelly Wrote:
 While I've never worked on systems where lives hang in the balance,
 I have
 worked on systems where 100% uptime is required. I favor the Erlang
 approach
 where a system is a web of interconnected, redundant processes that
 terminate
 on errors. I've found this design an extremely hard sell in the
 internet
 server world though. The design takes more planning and people are
 in too
 much of a hurry.

 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design. With thread isolation it should be
 pretty
 easy to move parts of a program into separate processes as need
 dictates.

 Can we look forward to seeing ipc supported in phobos via the same
 interface in
 the future? I really like the api you have created.

 Yes. The core send/receive API should work just fine for IPC, and it's
 definitely on the map. The greatest obstacle there is probably the need
 for a solid serialization/deserialization package in Phobos.

 I have a serialization library, http://dsource.org/projects/orange/ ,
 this is a list of some its features:

 * It handles both serializing and deserializing
 * It automatically serializes the base classes
 * It supports events (before and after (de)serializing)
 * It supports non-serialized fields (you can say that some fields in a
 class should not be serialized)
 * It's licensed under the Boost license
 * It's fairly std/runtime library independent
 * You can create new archive types and use them with the existing
 serializer
 * Serializes through base class references
 * Serializes third party types

 Currently it only works using Tango but the only part of the library
 that is dependent on the std/runtime library is XMLArchive, I'm
 currently working on porting it to Phobos. It also needs testing.

 Also issue 2844 and the one, can't find it now, about getMembers is not
 implemented at all (returns an empty array).

... should be fixed.


-- 
/Jacob Carlborg

Jun 25 2010

"Robert Jacques" <sandford jhu.edu> writes:

On Fri, 25 Jun 2010 08:54:54 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-06-25 05:17, Sean Kelly wrote:
 Lutger<lutger.blijdestijn gmail.com>  wrote:
 Sean Kelly wrote:

 Sean Kelly Wrote:
 While I've never worked on systems where lives hang in the balance,
 I have
 worked on systems where 100% uptime is required.  I favor the Erlang
 approach
 where a system is a web of interconnected, redundant processes that
 terminate
 on errors.  I've found this design an extremely hard sell in the
 internet
 server world though.  The design takes more planning and people are
 in too
 much of a hurry.

 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design.  With thread isolation it should be
 pretty
 easy to move parts of a program into separate processes as need
 dictates.

 Can we look forward to seeing ipc supported in phobos via the same
 interface in
 the future? I really like the api you have created.

 Yes. The core send/receive API should work just fine for IPC, and it's
 definitely on the map.  The greatest obstacle there is probably the need
 for a solid serialization/deserialization package in Phobos.

 I have a serialization library, http://dsource.org/projects/orange/ ,  
 this is a list of some its features:

 * It handles both serializing and deserializing
 * It automatically serializes the base classes
 * It supports events (before and after (de)serializing)
 * It supports non-serialized fields (you can say that some fields in a  
 class should not be serialized)
 * It's licensed under the Boost license
 * It's fairly std/runtime library independent
 * You can create new archive types and use them with the existing  
 serializer
 * Serializes through base class references
 * Serializes third party types

 Currently it only works using Tango but the only part of the library  
 that is dependent on the std/runtime library is XMLArchive, I'm  
 currently working on porting it to Phobos. It also needs testing.

 Also issue 2844 and the one, can't find it now, about getMembers is not  
 implemented at all (returns an empty array).

I'll volunteer to help test (and to add JSON capabilities) when you're  
ready.

Jun 25 2010

Jacob Carlborg <doob me.com> writes:

On 2010-06-25 17:40, Robert Jacques wrote:
 On Fri, 25 Jun 2010 08:54:54 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2010-06-25 05:17, Sean Kelly wrote:
 Lutger<lutger.blijdestijn gmail.com> wrote:
 Sean Kelly wrote:

 Sean Kelly Wrote:
 While I've never worked on systems where lives hang in the balance,
 I have
 worked on systems where 100% uptime is required. I favor the Erlang
 approach
 where a system is a web of interconnected, redundant processes that
 terminate
 on errors. I've found this design an extremely hard sell in the
 internet
 server world though. The design takes more planning and people are
 in too
 much of a hurry.

 I should add that I'm hoping the message passing model in D will help
 encourage reliable system design. With thread isolation it should be
 pretty
 easy to move parts of a program into separate processes as need
 dictates.

 Can we look forward to seeing ipc supported in phobos via the same
 interface in
 the future? I really like the api you have created.

 Yes. The core send/receive API should work just fine for IPC, and it's
 definitely on the map. The greatest obstacle there is probably the need
 for a solid serialization/deserialization package in Phobos.

 I have a serialization library, http://dsource.org/projects/orange/ ,
 this is a list of some its features:

 * It handles both serializing and deserializing
 * It automatically serializes the base classes
 * It supports events (before and after (de)serializing)
 * It supports non-serialized fields (you can say that some fields in a
 class should not be serialized)
 * It's licensed under the Boost license
 * It's fairly std/runtime library independent
 * You can create new archive types and use them with the existing
 serializer
 * Serializes through base class references
 * Serializes third party types

 Currently it only works using Tango but the only part of the library
 that is dependent on the std/runtime library is XMLArchive, I'm
 currently working on porting it to Phobos. It also needs testing.

 Also issue 2844 and the one, can't find it now, about getMembers is
 not implemented at all (returns an empty array).

 I'll volunteer to help test (and to add JSON capabilities) when you're
 ready.

It's ready to be tested with D1 and Tango. You can also start building a 
JSON archive; if you use D2 this will also make sure there are no other 
Tango dependencies (other than in XMLArchive). Let me know if you need any 
help.

-- 
/Jacob Carlborg

Jun 25 2010

BCS <none anon.com> writes:

Hello Walter,

 And not just the software business, I saw a technical overview of the
 BP oil spill failure, and the rig design violated just about every
 principle of safe system design.

link by chance?

-- 
... <IXOYE><

Jun 22 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/21/2010 01:14 PM, Sean Kelly wrote:
 Andrei Alexandrescu Wrote:

 On 06/20/2010 06:18 PM, Vladimir Panteleev wrote:
 On Mon, 21 Jun 2010 00:17:28 +0300, Walter Bright
 <newshound2 digitalmars.com>  wrote:

 An input to a dll is user input, and should be validated (for
 the sake of security, and other reasons). Validating it is not
 debugging.

 I don't understand why you're saying this. Security checks in
 DLL functions are pointless, for the reasons I already outlined:

 [snip]

 I think the matter is simpler than that. Essentially DbC is
 modular integrity checking. If Phobos enforce()s parameters in
 calls, then it considers its own integrity a different matter than
 the integrity of the application it's used with.

 If Phobos used contracts to validate parameters, it would directly
 share responsibility for the integrity of the entire application.
 That way, users will not be sure whether the failure is a bug in
 Phobos or one in their own code.

 If an unrecoverable failure occurs within the process, does it matter
 where the error originated?

It does matter for the postmortem. Black box ftw.

 I've been thinking about this a bit and am starting to wonder about
 the benefit of distinguishing API boundary integrity checking vs.
 internal integrity checking.  First, what if a library eats its own
 dogfood?  If my library provides a public method to spawn threads and
 the library itself uses threads internally then I have two different
 methods of checking the integrity of my own library, each possibly
 throwing different exceptions (enforce throws Exception while assert
 throws AssertError).  At the very least, this seems like it could
 cause maintenance issues because a logic change deep within the
 library may require additional exception handling to deal with what
 are intended to be user-facing errors.

Any complex API will face at some point some tension between reusing and 
duplicating code. Per the classic joke:

"A mathematician was in a habit of making a cup of tea when working late 
at night. His normal method was to get the teapot from the cupboard, 
take the teapot to the sink, add water, heat to boiling, then make the 
cup of tea. Unfortunately, one night when he went to make tea, the 
teapot was already full of water and sitting on the stove. He thought 
about this for several minutes, then emptied the teapot and put it back 
in the cupboard, thereby reducing this to a previously solved problem."

It's often the case that an API throws some water for the sake of 
reusing itself.

 In a similar vein, if contracts are used within an API but a
 different mode of checking is used at the API boundary, when
 contracts are enabled in that API the user is faced with the bizarre
 issue of receiving AssertErrors from internal API logic errors but
 only Exceptions from his own logic errors for API boundary calls.
 When you say that DbC is modular I'd presume that means it's
 encapsulated within each distinct subsystem, but it seems like this
 isn't true at all.  The alternative is to turn DbC off in the library
 and either live with undefined behavior or a hard crash if there's a
 bug in the library, a circumstance which is again forced upon the
 user.

When I say DbC is modular I have in mind the following: "assert() inside 
a well-defined library entity (e.g. Phobos) is supposed to check the 
integrity of that entity, not the integrity of the entity using it."

I think that's reasonable. Integrity is a cross-cutting concern in 
memory-unsafe programs, but not in memory-safe programs. (Whoa, that's 
interesting.)

 Regarding DbC, I can't say that I've ever worked on a system where
 lives hung in the balance (an admittedly extreme example of where DbC
 is useful), but if I were sufficiently concerned about process
 integrity that I had contracts enabled then I don't think I would
 trust that a third-party library was bug-free and therefore didn't
 need its own contract checking in place.  Once I've accepted the cost
 of integrity checking I want it everywhere, not just in my own code.
 It makes for consistent error checking behavior (I'd assume there is
 a system in place to trap DbC errors specifically) and provides
 full-process integrity checking.

There's a large spectrum between "people will die" etc. and "I don't 
give a flying frak". I think an application under construction may well 
choose to use contracts for itself but not for the well-tested (ahem) 
Phobos.

 I think the only boundary that really matters is the process
 boundary.  Any error within a process has the same effect regardless
 of whether it's in user code or library code--corrupted memory,
 segfaults, etc--so why make a distinction between code I wrote and
 code someone else wrote?

For post-mortem and for assigning blame appropriately. If Phobos used 
DbC on user-passed inputs it would essentially share blame for the 
application integrity with all applications.

 By the same token, if the user chooses to
 disable contracts then why force them upon him for some errors but
 not others?  The user is making the explicit choice to run his
 process through a meat-grinder if something unexpected happens, he's
 effectively asserting that his code is perfect, so why tell him that
 "no, it's actually not."

That's a good point, but imho not enough to challenge the status quo. It 
might make sense to have an "unsafe" build for Phobos that assumes 
absolutely all arguments are correct. I wonder how much of an 
improvement it would bring.

 For me, the more difficult issue is how much integrity checking
 should be done.  For example, I created an AVL tree a while back that
 verified that the tree was still properly balanced after every API
 call.  This was great from a code verification standpoint, but the
 check completely violated the complexity guarantees and ultimately
 checked something that could have been proven to a reasonable degree
 of confidence through code reviews and unit testing.  Should checks
 like this be enabled automatically when DbC is turned on?  Should
 there be different levels of DbC?  In some respects I feel like
 there's a difference between in/out contracts and invariants, but
 even that doesn't seem completely right.  Thoughts?

I take no prisoners there: integrity checks MUST NOT affect complexity. 
Complexity is part of the spec, so such checks would automatically 
violate the spec.

At some point in its history, the binary search functions in 
std.algorithm had an enforce(isSorted) check. It should be buried 
somewhere in the svn history. It was my fault, and I didn't realize it 
until after I had waited a whole day for a script to complete.
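
A minimal sketch of why such a check breaks the spec. It is not the historical
std.algorithm code; slowContains and fastContains are hypothetical names, and
the sketch assumes present-day Phobos, where std.range.assumeSorted wraps a
range the caller promises is sorted and contains() does a binary search.

```d
import std.algorithm : isSorted;
import std.exception : enforce;
import std.range : assumeSorted;

// The O(n) isSorted scan runs on every call, so the documented
// O(log n) search silently becomes O(n).
bool slowContains(int[] haystack, int needle)
{
    enforce(isSorted(haystack), "input must be sorted");
    return assumeSorted(haystack).contains(needle);
}

// Sortedness is a documented precondition that the caller is trusted
// to meet, so the search keeps its O(log n) bound.
bool fastContains(int[] haystack, int needle)
{
    return assumeSorted(haystack).contains(needle);
}

void main()
{
    auto data = [1, 3, 5, 7, 9];
    assert(slowContains(data, 7));
    assert(fastContains(data, 7));
    assert(!fastContains(data, 4));
}
```

Called in a loop over a large array, the first version turns n lookups into
O(n^2) work, which is the kind of slowdown described above.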


Andrei

Jun 27 2010

"Rory McGuire" <rmcguire neonova.co.za> writes:

On Mon, 21 Jun 2010 06:08:01 +0200, Leandro Lucarella <luca llucax.com.ar>  
wrote:

 Walter Bright, on June 20 at 19:32 you wrote to me:
 Leandro Lucarella wrote:
Why will you assume I'm so dumb that I won't use your
interface correctly?

 Windows has had major legacy compatibility issues because critical
 third party applications misused the APIs.

 People *will* misuse your API, and you will get blamed for it. It's
 unfair, but that's how it works.

 Luckily I haven't used Windows for about 10 years now =)

 It's really a shame that D will take the stupidity route.

 PS: I don't know how Windows works, but if calling the Windows API is
     like going into kernel mode, and you can mess with other processes, it
     seems reasonable to check every API call as if it were user
     input, but if you're confined to your process, it's really stupid.

I think perhaps you misunderstood: it is mostly not stupidity that causes
people to use undocumented "features" of an API, but rather people being
overly "clever".
Jun 21 2010

Sean Kelly <sean invisibleduck.org> writes:

Rory McGuire Wrote:
 
 I think perhaps you mis-understood, it is mostly not stupidity that causes  
 people to use
 undocumented "features" of an API but rather, it is people being overly  
 "clever".

Or sometimes simply desperation.  There are some classes of apps that require
the use of undocumented API calls to operate on Windows--I believe disk
defragmenters are one example.  Microsoft rightly didn't document these calls
because it wasn't prepared to support them long-term, but in doing so they also
prevented users from doing necessary work and effectively forced them into
using API calls that might change unexpectedly.  I think these users accept
this problem and do the necessary verification and updating when new OS
revisions are released however.

Jun 21 2010

Don <nospam nospam.com> writes:

Sean Kelly wrote:
 Rory McGuire Wrote:
 I think perhaps you mis-understood, it is mostly not stupidity that causes  
 people to use
 undocumented "features" of an API but rather, it is people being overly  
 "clever".

 
 Or sometimes simply desperation.  There are some classes of apps that require
the use of undocumented API calls to operate on Windows--I believe disk
defragmenters are one example.  Microsoft rightly didn't document these calls
because it wasn't prepared to support them long-term, but in doing so they also
prevented users from doing necessary work and effectively forced them into
using API calls that might change unexpectedly.  I think these users accept
this problem and do the necessary verification and updating when new OS
revisions are released however.

Remember Windows 3.0? File handling involved undocumented API calls!

Jun 21 2010

Adrian Matoga <epi atari8.info> writes:

Leandro Lucarella writes:
 Walter Bright, on June 20 at 19:32 you wrote to me:
 Leandro Lucarella wrote:
 Why will you assume I'm so dumb that I won't use your
 interface correctly?

 Windows has had major legacy compatibility issues because critical
 third party applications misused the APIs.

 People *will* misuse your API, and you will get blamed for it. It's
 unfair, but that's how it works.

 
 Luckily I haven't used Windows for about 10 years now =)
 
 It's really a shame that D will take the stupidity route.
 
 PS: I don't know how Windows works, but if calling the Windows API is
     like going into kernel mode, and you can mess with other processes, it
     seems reasonable to check every API call as if it were user
     input, but if you're confined to your process, it's really stupid.
 

It was 15 years ago, at the times of 3.x and 95, when Windows behaved 
like that.

The problem applies not only to Windows, but any API you would imagine.
A common situation is when you need to do your job quickly using only 
some part of a library which otherwise you aren't going to study 
thoroughly, or you want only a proof-of-concept. And if you're attempting 
to use something new to you, you do make mistakes, no matter how 
convinced you are that you do not.
If the API is defined not by documentation (which is often a tissue of 
lies, and hardly ever unambiguous), but by means of input checking, 
you have benefits in two fields: 1) developers of the library had to think 
about what they wanted to do, so the library probably works, and it's less 
probable that its new versions will break compatibility, and 2) users 
of the library will be warned quickly, saving their time.

It's not about messing other processes. It's about saving your time, 
otherwise consumed by effects of common mistakes, misunderstanding the 
documentation, or working in a hurry. And your time costs much more than 
the time of bazillion argument checks.
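A minimal sketch of the input-checking approach argued for above, using `std.exception.enforce` (at the time of this thread it lived in `std.contracts`). The function `percentOf` and its error message are hypothetical, chosen only for illustration.

```d
import std.exception : enforce;

/// Hypothetical library function: what percentage `part` is of `whole`.
double percentOf(double part, double whole)
{
    // The check is part of the function's behaviour: it throws an Exception
    // on bad input and is not compiled out by -release.
    enforce(whole != 0, "percentOf: whole must be non-zero");
    return 100.0 * part / whole;
}

unittest
{
    import std.exception : assertThrown;

    assert(percentOf(1, 4) == 25.0);
    // A mistaken call is reported immediately instead of producing garbage.
    assertThrown!Exception(percentOf(1, 0));
}
```

The user who misreads the documentation gets an exception with a message at the first wrong call, which is the time saving the post refers to.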

Jun 21 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 21:15:12 +0300, Adrian Matoga <epi atari8.info> wrote:

 It was 15 years ago, at the times of 3.x and 95, when Windows behaved  
 like that.

More like 10, Windows Millennium was the last 9x-based Windows operating  
system without strong process isolation.

 The problem applies not only to Windows, but any API you would imagine.
 A common situation is when you need to do your job quickly using only  
 some part of a library which otherwise you aren't going to study  
 thoroughly, or you want only a proof-of-concept. And if you're attempting
 to use something new to you, you do make mistakes, no matter how
 convinced you are that you do not.
 If the API is defined not by documentation (which is often a tissue of
 lies, and hardly ever unambiguous), but by means of input checking,
 you have benefits in two fields: 1) developers of the library had to think
 about what they wanted to do, so the library probably works, and it's less
 probable that its new versions will break compatibility, and 2) users
 of the library will be warned quickly, saving their time.

 It's not about messing other processes. It's about saving your time,  
 otherwise consumed by effects of common mistakes, misunderstanding the  
 documentation, or working in a hurry. And your time costs much more than  
 the time of bazillion argument checks.

For this particular situation, contracts would be just fine though :)

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 21 2010

BCS <none anon.com> writes:

Hello Leandro,

 If the DLL has a bug, it will explode anyways.

A DLL can work just fine (a.k.a. not explode) and still return garbage as 
long as it never depends on not seeing the kind of garbage it's producing. 
Say for instance it's written in D and returns a string with a missing \0 
as a char*/length, the DLL works just fine but a printf blows up.
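The missing-terminator problem can be sketched in D as follows. The helper name `lengthSeenByC` is hypothetical; `toStringz` and `strlen` are the standard Phobos and C library functions. The example relies on D string literals being NUL-terminated, which is why reading past the slice is well defined here; with arbitrary data it would not be.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Length a C function sees when the string is converted properly.
size_t lengthSeenByC(string s)
{
    // toStringz yields a pointer to NUL-terminated data, copying if needed.
    return strlen(s.toStringz);
}

unittest
{
    string whole = "hello world"; // string literals carry a trailing NUL
    string part = whole[0 .. 5];  // a slice: same memory, no NUL after "hello"

    assert(part.length == 5);
    // Handing part.ptr to C drops the length: strlen runs on until it
    // reaches the terminator of the whole literal.
    assert(strlen(part.ptr) == 11);
    // Converting first gives the C side what the D side meant.
    assert(lengthSeenByC(part) == 5);
}
```

Nothing on the D side misbehaves in the first case, which is the point of the post: the producer works fine and the consumer is the one that blows up.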

 You
 are just trying to catch programs bugs in the DLL,

Exactly.

 which seems overly
 patronizing to me.

Um, it doesn't to me.

 Why will you assume I'm so dumb that I won't use your interface correctly?

First, because some people are. And second, because it's trivially easy to respond 
to support calls that start with "Your DLL is throwing a
YouAreNotUsingThisDLLCorrectlyRTFM Exception" <joke/>.

-- 
... <IXOYE><

Jun 20 2010

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 21 Jun 2010 07:42:34 +0300, BCS <none anon.com> wrote:

 Why will you assume I'm so dumb that I won't use your interface  
 correctly?

 First, because some people are. And second, because it's trivially easy to
 respond to support calls that start with "Your DLL is throwing a
 YouAreNotUsingThisDLLCorrectlyRTFM Exception" <joke/>.

I think that for such situations you should ship a debug and release  
version of the DLL.
This way you don't sacrifice performance when the user doesn't want to be  
held by the hand.

-- Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jun 21 2010

BCS <none anon.com> writes:

Hello Vladimir,

 On Mon, 21 Jun 2010 07:42:34 +0300, BCS <none anon.com> wrote:
 
 Why will you assume I'm so dumb that I won't use your interface
 correctly?
 

 First, because some people are. And second, because it's trivially easy
 to respond to support calls that start with "Your DLL is throwing a
 YouAreNotUsingThisDLLCorrectlyRTFM Exception" <joke/>.
 

 I think that for such situations you should ship a debug and release
 version of the DLL.
 This way you don't sacrifice performance when the user doesn't want to
 be held by the hand.

Until you can show me a perf problem, I don't see any point in doing that. 
(OTOH, deep structure validation, or anything else slower than O(1), is another 
thing altogether.)

-- 
... <IXOYE><

Jun 21 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

BCS <none anon.com> wrote:
 I think that for such situations you should ship a debug and release
 version of the DLL.
 This way you don't sacrifice performance when the user doesn't want to
 be held by the hand.

 Until you can show me a perf problem, I don't see any point in doing
 that. (OTOH, deep structure validation, or anything else slower than
 O(1), is another thing altogether.)

Also, if you do have two different versions, I'll bet you ready money
someone will program only the release version, because "that's the
version the users will have", "the debug version is slow" or whatever
other inane excuse their minds are capable of coming up with.

-- 
Simen

Jun 22 2010

Sean Kelly <sean invisibleduck.org> writes:

"Simen kjaeraas" <simen.kjaras gmail.com> wrote:
 BCS <none anon.com> wrote:
 I think that for such situations you should ship a debug and release
 version of the DLL.
 This way you don't sacrifice performance when the user doesn't want to
 be held by the hand.

 Until you can show me a perf problem, I don't see any point in doing
 that. (OTOH, deep structure validation, or anything else slower than
 O(1), is another thing altogether.)


 
 Also, if you do have two different versions, I'll bet you ready money
 someone will program only the release version, because "that's the
 version the users will have", "the debug version is slow" or whatever
 other inane excuse their minds are capable of coming up with.

What I've done with druntime is build checked and unchecked versions.  I
don't think it makes sense to ship a debug version of a library because
that's for debugging the library, not user code. I'll admit that I like
having debug symbols in place though, just not the debug tests
themselves. It's occasionally nice to not have a trace vanish just
because it passes through library code.

Jun 22 2010

Lutger <lutger.blijdestijn gmail.com> writes:

Simen kjaeraas wrote:

 BCS <none anon.com> wrote:
 I think that for such situations you should ship a debug and release
 version of the DLL.
 This way you don't sacrifice performance when the user doesn't want to
 be held by the hand.

 Until you can show me a perf problem, I don't see any point in doing
 that. (OTOH, deep structure validation, or anything else slower than
 O(1), is another thing altogether.)

 
 Also, if you do have two different versions, I'll bet you ready money
 someone will program only the release version, because "that's the
 version the users will have", "the debug version is slow" or whatever
 other inane excuse their minds are capable of coming up with.
 

Naturally, debug is for debugging, not shipping. Instead one could make three 
versions:
- debug  
- release
- unsafe

Or rather let the user compile unsafe themselves if you can distribute the 
source code. 

I am sure most inane users (like me) will choose release.

Jun 22 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 20/06/10 22:17, Walter Bright wrote:
 An input to a dll is user input, and should be validated (for the sake
 of security, and other reasons). Validating it is not debugging.

In that case, feel free to compile DLLs with external contract checking 
switched on, but please do not blur the conceptual distinction between 
contracts and exceptions.

You are talking about compiling a library into a binary DLL that should 
be fit for general usage. In that case, there are good reasons to leave 
the input contract checking active.

In the general case, however, the library user has the control over how 
to compile the library and link to it (just think of inlining). In this 
case, the library user should be allowed to switch off the contract 
checking (at their own risk!)

Conceptually, the ultimate solution would certainly be to place code for 
input contract checking in the *calling* code. After all, this checking 
code serves to debug the calling code, so it should be left to the 
caller to decide whether checking is necessary.

This approach would also allow the compiler to optimize out some checks 
when their correctness can be tested at compile time.

Output contract checks, on the other hand should be compiled inside the 
returning routine.

After all, it is all a matter of trust. A language designer should trust 
the language user to know what he is doing. A library designer should 
trust the library user to act responsibly. After all - if the 
application breaks it is the application designer who has to answer for it.

Jun 28 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

Norbert Nemec <Norbert nemec-online.de> wrote:

 A library designer should trust the library user to act responsibly.
 After all - if the application breaks it is the application designer
 who has to answer for it.

And if the application designer finds that his design breaks due to
a change in the library, he will blame the library designers. If it
is used in a big, well-known and much-used application, the library
designer might have no choice but to continue supporting broken
code.

-- 
Simen

Jun 28 2010

bearophile <bearophileHUGS lycos.com> writes:

Norbert Nemec:
 [...] to place code for input contract checking in the *calling* code. [...]
 Output contract checks, on the other hand should be compiled inside the 
 returning routine.

Is this a positive thing to do? Can this be done? (D must support separate
compilation, but in many situations this is not done/necessary, so maybe in
such situations it can be done). Is Eiffel doing it? if it's a good thing and
it's doable then what kind of changes does it require to the compiler?

Bye,
bearophile

Jun 28 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 28/06/10 12:59, bearophile wrote:
 Norbert Nemec:
 [...] to place code for input contract checking in the *calling* code. [...]
 Output contract checks, on the other hand should be compiled inside the
 returning routine.

 Is this a positive thing to do? Can this be done? (D must support separate
compilation, but in many situations this is not done/necessary, so maybe in
such situations it can be done). Is Eiffel doing it? if it's a good thing and
it's doable then what kind of changes does it require to the compiler?

These are good and pragmatic questions that you ask.

The whole issue only arises when doing separate compilation of a library 
and an application. (I use the term "application" for any code that uses 
the library.)

In an ideal world (beware, I am switching off "pragmatic thinking mode" 
for a moment), I would describe the situation as follows:

Either part can be compiled in "debug" mode or in "release" mode. Debug 
mode in the library means that you want to debug the library code 
itself. Release mode in the library means that you trust the library 
code to be correct and switch off all internal checks.

The interesting situation is when you compile the application in debug 
mode, using a stable library compiled in release mode. In this case, the 
input contracts of the library need to be checked to catch bugs in the 
application. Once the testing phase is finished, you want to compile the 
application and get rid of the contract checks.

In this idealized picture, the input contracts should clearly be checked 
in the application code so that the application developer has control 
over them.

Now, for the real world: Code is hardly ever 100% bug free. I agree. 
With this argument, however, the concept of debug and release mode 
becomes pointless as you should never switch off the checks anyway. 
Assuming that you are not quite as paranoid you might still reach a 
point where you trust your code to switch off the checks. How about 
input contracts of a library?

As Simen mentioned in his post, there is the issue of library authors 
trying to avoid blame from application authors and the other way around. 
Ultimately, this is exactly what contracts are for: formalize the 
interface as much as possible and make it machine-checkable. If an 
application breaks, compile it in debug mode and see whether it finds 
the problem. With input contracts checked in the calling code, this 
would automatically switch on all the relevant checks to identify bugs 
in the application. If the application designer still tries to blame the 
library, why not simply supply a library compiled in debug mode? Any 
violation of a contract or an assertion unquestionably signals a bug in 
the code.

Can it be done? Certainly. The contract is part of the interface, so 
rather than compiling it into the DLL, it should be stored with the 
interface definitions and left for the application compiler to insert if 
requested. The code for this should not be any more involved than 
function inlining. Effectively, every function with contracts would be 
turned into a wrapper that first checks the contracts and then calls the 
real function in the DLL. The application compiler could then simply 
decide whether to inline the wrapper or simply call the function without 
checks.
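The wrapper scheme described above can be approximated by hand in today's D. This is only a sketch: no compiler generates it automatically, the names `divideUnchecked` and `divide` are hypothetical, and the template flag `checked` merely stands in for the application's build mode.

```d
/// Hypothetical library entry point, compiled without any checks.
int divideUnchecked(int a, int b)
{
    return a / b;
}

/// Caller-side wrapper, as it might be built from the interface definition.
/// Being a template, it is compiled as part of the application, not the
/// library, so the application decides whether the check is present.
int divide(bool checked = true)(int a, int b)
{
    static if (checked)
    {
        // The library's precondition, evaluated in the calling code.
        assert(b != 0, "divide: divisor must be non-zero");
    }
    return divideUnchecked(a, b);
}

unittest
{
    assert(divide(6, 3) == 2);        // checked wrapper
    assert(divide!false(6, 3) == 2);  // application opted out of the check
}
```

With the check disabled the wrapper is a plain forwarding call, which an optimizer can inline away, matching the remark that the scheme is no more involved than function inlining.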

I have no idea how Eiffel does it, but I am quite certain that this 
solution is following the original spirit of DbC of Bertrand Meyer.

Greetings,
Norbert

Jun 30 2010

Sean Kelly <sean invisibleduck.org> writes:

Norbert Nemec Wrote:

 On 28/06/10 12:59, bearophile wrote:
 Norbert Nemec:
 [...] to place code for input contract checking in the *calling* code. [...]
 Output contract checks, on the other hand should be compiled inside the
 returning routine.

 Is this a positive thing to do? Can this be done? (D must support separate
compilation, but in many situations this is not done/necessary, so maybe in
such situations it can be done). Is Eiffel doing it? if it's a good thing and
it's doable then what kind of changes does it require to the compiler?

 
 These are good and pragmatic questions that you ask.
 
 The whole issue only arises when doing separate compilation of a library 
 and an application. (I use the term "application" for any code that uses 
 the library.)
 
 In an ideal world (beware, I am switching off "pragmatic thinking mode" 
 for a moment), I would describe the situation as follows:
 
 Either part can be compiled in "debug" mode or in "release" mode. Debug 
 mode in the library means that you want to debug the library code 
 itself. Release mode in the library means that you trust the library 
 code to be correct and switch off all internal checks.

I see the choice of "release" for disabling contracts as a huge mistake in
nomenclature.  For libraries, I would ship a checked and unchecked build (with
-release disabled and enabled), but none with -debug or -unittest set.  Those
are for internal testing and the user shouldn't care to turn on debug code in a
library simply because he's debugging his own app.

The idea of compiling the "in" contract into the application code is an
interesting one, but I suspect it could be tricky.  Consider an unchecked build
of the library, a checked build of the app, and now taking the address of a
library function.  Worse, what if a library routine returns the address of
another library routine?  Now the application has a reference to an unchecked
version of the function, even if the involved technical hurdles are surmounted
(multiple entry points or the like).

This seems like a nice idea but it seems too complicated.  I'd rather just ship
checked and unchecked builds of a library and leave it at that.

Jun 30 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 30/06/10 17:45, Sean Kelly wrote:
 Norbert Nemec Wrote:

 On 28/06/10 12:59, bearophile wrote:
 Norbert Nemec:
 [...] to place code for input contract checking in the *calling* code. [...]
 Output contract checks, on the other hand should be compiled inside the
 returning routine.

 Is this a positive thing to do? Can this be done? (D must support separate
compilation, but in many situations this is not done/necessary, so maybe in
such situations it can be done). Is Eiffel doing it? if it's a good thing and
it's doable then what kind of changes does it require to the compiler?

 These are good and pragmatic questions that you ask.

 The whole issue only arises when doing separate compilation of a library
 and an application. (I use the term "application" for any code that uses
 the library.)

 In an ideal world (beware, I am switching off "pragmatic thinking mode"
 for a moment), I would describe the situation as follows:

 Either part can be compiled in "debug" mode or in "release" mode. Debug
 mode in the library means that you want to debug the library code
 itself. Release mode in the library means that you trust the library
 code to be correct and switch off all internal checks.

 I see the choice of "release" for disabling contracts as a huge mistake in
nomenclature.  For libraries, I would ship a checked and unchecked build (with
-release disabled and enabled), but none with -debug or -unittest set.  Those
are for internal testing and the user shouldn't care to turn on debug code in a
library simply because he's debugging his own app.

 The idea of compiling the "in" contract into the application code is an
interesting one, but I suspect it could be tricky.  Consider an unchecked build
of the library, a checked build of the app, and now taking the address of a
library function.  Worse, what if a library routine returns the address of
another library routine?  Now the application has a reference to an unchecked
version of the function, even if the involved technical hurdles are surmounted
(multiple entry points or the like).

That's indeed an interesting aspect: Design by Contract (DbC) and 
function pointers. I am not sure how these concepts would merge properly 
at all.

The contracts are part of the interface, so they should in fact be part 
of the function pointer type! Of course this would quickly become 
ridiculous.

A strongly object oriented language like Eiffel can in principle do 
without function pointers. Instead, one can in most cases use classes 
with virtual functions that offer very similar functionality. A class 
interface comes with all the contracts, so everything is safe and sound.

I really do not know how to deal with function pointers in the clean DbC 
paradigm. If you assign a function with input contracts to a function 
pointer, whoever uses the pointer does not know about the contracts. 
This however, breaks down the strong DbC concept and turns contracts 
into mere run time checks.

Does this mean that D should give up the goal of proper DbC? Simply do 
the pragmatic thing and pick the best pieces from DbC without worrying 
about formal completeness? I guess so...

Jun 30 2010

Jay Byrd <JayByrd rebels.com> writes:

On Wed, 30 Jun 2010 20:03:07 +0100, Norbert Nemec wrote:

 On 30/06/10 17:45, Sean Kelly wrote:
 Norbert Nemec Wrote:

 On 28/06/10 12:59, bearophile wrote:
 Norbert Nemec:
 [...] to place code for input contract checking in the *calling*
 code. [...] Output contract checks, on the other hand should be
 compiled inside the returning routine.

 Is this a positive thing to do? Can this be done? (D must support
 separate compilation, but in many situations this is not
 done/necessary, so maybe in such situations it can be done). Is
 Eiffel doing it? if it's a good thing and it's doable then what kind
 of changes does it require to the compiler?

 These are good and pragmatic questions that you ask.

 The whole issue only arises when doing separate compilation of a
 library and an application. (I use the term "application" for any code
 that uses the library.)

 In an ideal world (beware, I am switching off "pragmatic thinking mode"
 for a moment), I would describe the situation as follows:

 Either part can be compiled in "debug" mode or in "release" mode.
 Debug mode in the library means that you want to debug the library
 code itself. Release mode in the library means that you trust the
 library code to be correct and switch off all internal checks.

 I see the choice of "release" for disabling contracts as a huge mistake
 in nomenclature.  For libraries, I would ship a checked and unchecked
 build (with -release disabled and enabled), but none with -debug or
 -unittest set.  Those are for internal testing and the user shouldn't
 care to turn on debug code in a library simply because he's debugging
 his own app.

 The idea of compiling the "in" contract into the application code is an
 interesting one, but I suspect it could be tricky.  Consider an
 unchecked build of the library, a checked build of the app, and now
 taking the address of a library function.  Worse, what if a library
 routine returns the address of another library routine?  Now the
 application has a reference to an unchecked version of the function,
 even if the involved technical hurdles are surmounted (multiple entry
 points or the like).

 
 That's indeed an interesting aspect: Design by Contract (DbC) and
 function pointers. I am not sure how these concepts would merge properly
 at all.
 
 The contracts are part of the interface, so they should in fact be part
 of the function pointer type! Of course this would quickly become
 ridiculous.
 
 A strongly object oriented language like Eiffel can in principle do
 without function pointers. Instead, one can in most cases use classes
 with virtual functions that offer very similar functionality. A class
 interface comes with all the contracts, so everything is safe and sound.
 
 I really do not know how to deal with function pointers in the clean DbC
 paradigm. If you assign a function with input contracts to a function
 pointer, whoever uses the pointer does not know about the contracts.
 This however, breaks down the strong DbC concept and turns contracts
 into mere run time checks.
 
 Does this mean that D should give up the goal of proper DbC? Simply do
 the pragmatic thing and pick the best pieces from DbC without worrying
 about formal completeness? I guess so...

This is all very confused, and is reflected in D implementing contracts 
all wrong. Contracts do not belong to function pointers or any other 
dynamic state -- they apply to the invoker, and thus the static type. 
Isn't that obvious? If I have Foo f = getSomeFoo(); result = f.method(args), 
the args must satisfy the contract for Foo.method(), regardless 
of what method() in the object returned by getSomeFoo() is willing to 
accept (it must, of course, not require more than Foo.method() does; TDPL 
at least gets that right). And the guarantees on result must be those 
promised by Foo.method(), not some much weaker promise given by method() 
in its base class (and there is no need to check stronger guarantees made 
by method() in the actual derived object, since the caller didn't ask for 
them).

The D implementation works much too hard to get the wrong result, both in 
complexities of the compiler and in executing a bunch of irrelevant code 
that is ignored if it fails in preconditions or succeeds in 
postconditions, and after all that it fails to enforce requirements it 
should and to guarantee promises that it should.

-- JB

Sep 10 2010

bearophile <bearophileHUGS lycos.com> writes:

Jay Byrd:
 This is all very confused, and is reflected in D implementing contracts all
wrong.

If you know well the ideas of DbC, and you think there are some problems in the
DbC of D2, then I suggest you to not just write what's wrong, why it is wrong
and what bad things such wrong design may cause, and what you suggest to
change, starting from the most important changes.

Maybe Walter will change nothing, but if you write just a little rant where you
say that all is wrong, probably nothing will change, and what you have written
is useless.

Bye,
bearophile

Sep 11 2010

bearophile <bearophileHUGS lycos.com> writes:

 then I suggest you to not just write what's...

Ignore that 'not', please.

Sep 11 2010

retard <re tard.com.invalid> writes:

Sat, 11 Sep 2010 07:16:56 -0400, bearophile wrote:

 Jay Byrd:
 This is all very confused, and is reflected in D implementing contracts
 all wrong.

 
 If you know well the ideas of DbC, and you think there are some problems
 in the DbC of D2, then I suggest you to not just write what's wrong, why
 it is wrong and what bad things such wrong design may cause, and what
 you suggest to change, starting from the most important changes.
 
 Maybe Walter will change nothing, but if you write just a little rant
 where you say that all is wrong, probably nothing will change, and what
 you have written is useless.
 
 Bye,
 bearophile

People are doing it wrong. They shouldn't come and rant here. They should 
write compiler patches instead. All discussion is bad, real code matters.

Sep 11 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 11/09/10 08:18, Jay Byrd wrote:
 Contracts do not belong to function pointers or any other
 dynamic state -- they apply to the invoker, and thus the static type.
 Isn't that obvious?

In fact, it is yet one step more complex than that: as the name itself 
suggests, contracts are "between" the caller (static type) and the 
callee (dynamic type). Theoretically, the type system has to ensure that 
both sides are able to fulfill their part of the contract.

The dynamic type must be a subtype of the static type (including 
equality). In subtyping, in-contracts may be weakened, out-contracts may 
be strengthened (in other words, a subtype may require less and promise 
more).
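The subtyping rule can be illustrated with D's own contract syntax. The classes `Resizer` and `FlexibleResizer` are hypothetical. The last assertion reflects my understanding of the D specification, under which a call is accepted when the precondition of the overriding function or of any function it overrides holds, evaluated in the callee.

```d
class Resizer
{
    /// Base contract: requires factor >= 1, promises a non-negative result.
    double scale(double factor)
    in (factor >= 1)
    out (result; result >= 0)
    {
        return factor * 10;
    }
}

class FlexibleResizer : Resizer
{
    /// Requires less (any positive factor), promises more (positive result).
    override double scale(double factor)
    in (factor > 0)
    out (result; result > 0)
    {
        return factor * 10;
    }
}

unittest
{
    Resizer r = new FlexibleResizer;
    assert(r.scale(2) == 20);  // satisfies both contracts

    // 0.5 violates Resizer's precondition but satisfies FlexibleResizer's.
    // The call still succeeds, because the check is made against the
    // dynamic type, not against the static type Resizer.
    assert(r.scale(0.5) == 5);
}
```

That second call is the behaviour objected to earlier in the thread: a caller holding only a `Resizer` reference has not been held to `Resizer`'s contract.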

This is all fine, theoretically sound and easy to handle in a clean way 
for object oriented design as it is done in Eiffel.

The complication in D is function pointers and delegates (FP/DG). For a 
clean design, the type of a FP/DG would need to include contract 
information. Contracts are part of the interface and a FP/DG would have 
to include this. Obviously, this would make FP/DG syntax rather awkward. 
Furthermore, FP/DG assignments would need to be type-checked at compile 
time, so contract compatibility would have to be checked at compile time 
as well. This would be completely impossible.

I conclude that within pure OOP, contracts can have strong compile-time 
support. In-contracts should be checked by the caller, out-contracts by 
the callee and both checks could be eliminated if the compiler can 
verify at compile time that they are fulfilled. With FP/DG, this breaks 
down and I believe the best one can do is to implement contracts as 
run-time checks in the callee, just as it is done in D.

The only detail that I would wish for is a more fine-grained tuning of 
DbC contract checks in the compiler and a clearer conceptual separation 
of the concepts of assertions and contracts. However, the former is an 
implementation detail and the latter has been discussed to death before.

Sep 11 2010

bearophile <bearophileHUGS lycos.com> writes:

Norbert Nemec:

Thank you for your clear explanations of the situation.

 With FP/DG, this breaks 
 down and I believe the best one can do is to implement contracts as 
 run-time checks in the callee, just as it is done in D.

There is also a mixed strategy: to use run-time checks in the callee for FP/DG,
and something better for the other situations (FP/DG are present in D2
programs, but they aren't used everywhere).

Bye,
bearophile

Sep 12 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 12/09/10 14:48, bearophile wrote:
 Norbert Nemec:

 Thank you for your clear explanations of the situation.

 With FP/DG, this breaks
 down and I believe the best one can do is to implement contracts as
 run-time checks in the callee, just as it is done in D.

 There is also a mixed strategy: to use run-time checks in the callee for
FP/DG, and something better for the other situations (FP/DG are present in D2
programs, but they aren't used everywhere).

Indeed - this would mean a bare, unchecked interface for each function 
and a wrapper that adds contract checks. If the calling code can verify 
the contract at compile time, it may call the unchecked version. 
Otherwise (like with FP/DG), it will call the checked wrapper.

In fact, this could be understood as a kind of type-casting: A 
function interface includes the contract as part of the type 
information. If the contract is "cast" away (by assigning the function 
to a FP that does not include a contract), then the FP points to the 
run-time checked version of the routine.
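Written out by hand, the two entry points might look like the sketch below. The names `halveUnchecked` and `halve` are hypothetical, and the choice between them is made manually here, whereas the proposal has the compiler make it.

```d
/// Hypothetical bare routine: no checks at all.
int halveUnchecked(int n)
{
    return n / 2;
}

/// Checked wrapper: verifies the contract at run time, then forwards.
int halve(int n)
in (n % 2 == 0, "halve: n must be even")
{
    return halveUnchecked(n);
}

unittest
{
    // The caller can see that 8 is even, so the bare version is enough.
    assert(halveUnchecked(8) == 4);

    // A plain function pointer has no contract in its type, so it is
    // pointed at the checked wrapper and the check travels with it.
    int function(int) fp = &halve;
    assert(fp(10) == 5);
}
```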

This concept may even open the road towards FP/DG that include contract 
information without getting in the way of lightweight, contract-free 
FP/DGs as we have them now.

Greetings,
Norbert

Sep 12 2010

"Danny Wilson" <danny decube.net> writes:

Op Mon, 28 Jun 2010 10:33:24 +0200 schreef Norbert Nemec  
<Norbert nemec-online.de>:

 Conceptually, the ultimate solution would certainly be to place code for  
 input contract checking in the *calling* code. After all, this checking  
 code serves to debug the calling code, so it should be left to the  
 caller to decide whether checking is necessary.

I like this idea.

 This approach would also allow the compiler to optimize out some checks  
 when their correctness can be tested at compile time.

 Output contract checks, on the other hand should be compiled inside the  
 returning routine.

 After all, it is all a matter of trust. A language designer should trust  
 the language user to know what he is doing. A library designer should  
 trust the library user to act responsibly. After all - if the  
 application breaks it is the application designer who has to answer for  
 it.



Alot. :-)

Jun 28 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 19/06/10 22:46, Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.

IMHO, this is plain wrong!

By this kind of decision, you are putting the library user under 
tutelage. The D language is explicitly designed to allow the user to 
take off the safety belt, but to do so at their own responsibility. I 
typically get very annoyed out of principle if I explicitly and 
consciously switch *off* a safety feature and someone decides that they 
do not trust that I am mature enough for that decision and leave part of 
the safety checks in place.

The interface between parts of a program is exactly what contracts are 
designed for. Calling a function with incorrect arguments is a bug in 
the calling code. It should be caught at the interface by a contract 
violation. If the library user trusts their program enough to switch off 
contract checking, the library designer should not worry about 
double-checking for incorrect function calls.

Contracts are part of the public interface, so whatever is specified 
there is automatically documented for the world to see. Replacing 
contracts by "enforce" statements inside the library functions means 
that you have to document in prose what kind of input arguments are allowed.

There are exceptional cases when exceptions may be thrown for incorrect 
arguments: typically, whenever testing for input correctness would be 
too costly, e.g. for badly conditioned matrices in linear algebra code. 
Here, the problem typically shows up during the calculation when it is 
too late to issue a contract violation.

Otherwise, exceptions in library code should only happen in well-defined 
cases for run-time conditions (like I/O errors).

Sorry about my blunt words, but I feel that behind this issue there 
still is a rather fundamental misunderstanding of the concepts of 
contract programming.

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 06/28/2010 03:15 AM, Norbert Nemec wrote:
 On 19/06/10 22:46, Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.

 IMHO, this is plain wrong!

 By this kind of decision, you are putting the library user under
 tutelage. The D language is explicitly designed to allow the user to
 take off the safety belt, but to do so at their own responsibility.

C APIs also check their arguments.

Andrei

Jun 28 2010

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:
 C APIs also check their arguments.

Try again, C doesn't have DbC :-) Norbert Nemec says some good things.

Bye,
bearophile

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 Andrei Alexandrescu:
 C APIs also check their arguments.

 
 Try again, C doesn't have DbC :-)

What I meant to say was that even the standard library of a language 
famous for its to-the-metal performance still checks parameters 
compulsively whenever it can. Search e.g. this page:

http://google.com/codesearch/p?hl=en#XAzRy8oK4zA/libc/stdio/fseek.c&q=fseek&sa=N&cd=2&ct=rc

for "whence".

 Norbert Nemec says some good things.

I think it's an area where reasonable people may disagree.


Andrei

Jun 28 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

On 28/06/10 19:30, Andrei Alexandrescu wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 C APIs also check their arguments.

 Try again, C doesn't have DbC :-)

 What I meant to say was that even the standard library of a language
 famous for its to-the-metal performance still checks parameters
 compulsively whenever it can. [...]

Indeed, checking input arguments is essential. DbC simply means 
formalizing what has been done in any good library for ages.

My only intention was to make clear that checking input arguments is 
exactly what contracts are designed for. If the Phobos designers are 
worried that their input checks might be deactivated prematurely, we 
should fix the policy for (de/)activating the input contract checks 
rather than avoiding the use of input contracts the way they are 
intended to be used.

"enforce" simply is not the right tool for this purpose: When a contract 
or an assertion violation is found, it is clear that there is a bug in 
the code. Furthermore, it is even clear which portion of the code is 
responsible for the bug. An assertion violation inside a library is a 
bug in this library. This bug may simply be a missing input contract, 
but this still is a bug in the library.

An "enforce" violation in the library may be anything. The user has to 
dig into the library code to find out whether it is a library bug or a 
piece of incorrect input.

Raising a well-defined exception gives enough information about the 
problem. An anonymous "enforce" on the other hand is a quick-and-dirty 
solution that does not help the developer very much to identify the real 
problem.
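
A rough sketch of the difference, assuming a current Phobos in which `enforce` accepts an exception type as a template argument. `PercentRangeException` and both functions are hypothetical names used only for this illustration.

```d
import std.exception : enforce, collectException;

// Hypothetical exception type, introduced only for this illustration.
class PercentRangeException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Anonymous check: the caller only ever sees a generic Exception.
int percentAnonymous(int value)
{
    enforce(value >= 0 && value <= 100, "bad input");
    return value;
}

// Typed check: the exception class itself names what went wrong.
int percentTyped(int value)
{
    enforce!PercentRangeException(value >= 0 && value <= 100,
        "value must lie in 0 .. 100");
    return value;
}

void main()
{
    assert(percentAnonymous(50) == 50);
    assert(percentTyped(100) == 100);

    // Only the typed version can be caught by its specific class.
    assert(collectException!PercentRangeException(percentTyped(101)) !is null);

    auto generic = collectException(percentAnonymous(-1));
    assert(generic !is null);
    assert(cast(PercentRangeException) generic is null);
}
```

With the typed version the calling code can catch exactly this failure and nothing else, which is the "well-defined exception" case; the anonymous version forces the reader back into the library source.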


 Norbert Nemec says some good things.

 I think it's an area where reasonable people may disagree.

Thanks bearophile, thanks Andrei as well -- I really appreciate this 
open exchange of ideas. Feel free to shoot back as directly as I 
attacked... :-)

Jun 30 2010

Michel Fortin <michel.fortin michelf.com> writes:

On 2010-06-28 07:17:53 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 06/28/2010 03:15 AM, Norbert Nemec wrote:
 On 19/06/10 22:46, Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 
 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.

 
 IMHO, this is plain wrong!
 
 By this kind of decision, you are putting the library user under
 tutelage. The D language is explicitly designed to allow the user to
 take off the safety belt, but to do so at their own responsibility.

 
 C APIs also check their arguments.

With C you don't have the option to turn the checks on or off. It's 
generally better to have them when you don't need them than not have 
them when you need them. With D, you can turn them on or off on a whim.

If the 'in' contract was enforced at the call site instead of inside 
the function, it'd be up to the one using a function to decide whether 
to check contracts or not. That's not an option in C, but it could work 
like that in D...

I agree though that with the way contracts are currently implemented 
this doesn't work very well. You have to recompile the library with 
contracts on, which in turn forces all the internal contracts inside 
the library to be evaluated. All this because you're trying to validate 
inputs you give to that library? Doesn't make sense. In that context I 
agree that checking explicitly at the library boundaries might be a 
more viable option (like in C).
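
One way to approximate call-site checking with the language as it stands is sketched below. This is not how `in` contracts work; it is a workaround under the assumption that the library publishes its precondition as an ordinary predicate. All names are hypothetical.

```d
// Hypothetical library side: the precondition is exposed as a plain
// predicate, and the function itself performs no check of its own.
bool canTake(const int[] data, size_t count)
{
    return count <= data.length;
}

const(int)[] takeUnchecked(const int[] data, size_t count)
{
    return data[0 .. count];
}

// Hypothetical caller side: the assert lives in the caller's code, so it
// is the caller's build flags (e.g. -release) that decide whether the
// precondition is evaluated, not the flags the library was built with.
const(int)[] firstTwo(const int[] data)
{
    assert(canTake(data, 2), "need at least two elements");
    return takeUnchecked(data, 2);
}

void main()
{
    assert(firstTwo([1, 2, 3]) == [1, 2]);
}
```

The obvious cost is that nothing forces the caller to write the assert, which is exactly what a compiler-inserted call-site contract would fix.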

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Michel Fortin wrote:
 On 2010-06-28 07:17:53 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 On 06/28/2010 03:15 AM, Norbert Nemec wrote:
 On 19/06/10 22:46, Andrei Alexandrescu wrote:
 On 06/19/2010 03:55 PM, bearophile wrote:
 Inside Phobos2 I have counted about 160 usages of the "body" keyword.
 I think contract programming can be used more often inside Phobos2
 (and maybe some usages of enforce() can be turned into contract
 programming because they are more similar to program sanity checks).

 Walter and I discussed this and concluded that Phobos should handle its
 parameters as user input. Therefore they need to be scrubbed with hard
 tests, not contracts.

 IMHO, this is plain wrong!

 By this kind of decision, you are putting the library user under
 tutelage. The D language is explicitly designed to allow the user to
 take off the safety belt, but to do so at their own responsibility.

 C APIs also check their arguments.

 
 With C you don't have the option to turn the checks on or off.

#define NDEBUG


Andrei

Jun 28 2010

Sean Kelly <sean invisibleduck.org> writes:

Andrei Alexandrescu Wrote:
 
 C APIs also check their arguments.

Not the standard C library, as far as I know.  Of course, it's also gotten a
lot of flak for this.

Jun 28 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Sean Kelly wrote:
 Andrei Alexandrescu Wrote:
 C APIs also check their arguments.

 
 Not the standard C library, as far as I know.  Of course, it's also gotten a
 lot of flak for this.

Nonono. They check whenever they can. Oftentimes they're unable to check.

Example: fseek checks its whence parameter (mentioned in my previous 
post) but itoa cannot check that the target is a valid memory buffer of 
the appropriate length.
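
The contrast can be sketched as follows. `resolveSeek` and `writeDigitUnchecked` are hypothetical helpers modelled on the two cases; they are not the libc implementations of fseek or itoa.

```d
import core.stdc.stdio : SEEK_SET, SEEK_CUR, SEEK_END;

// Hypothetical helper in the spirit of the whence check: an invalid
// selector is cheap to detect, so it is rejected with an error value.
// Returns the new position, or -1 on bad input.
long resolveSeek(long current, long size, long offset, int whence)
{
    long base;
    switch (whence)
    {
        case SEEK_SET: base = 0; break;
        case SEEK_CUR: base = current; break;
        case SEEK_END: base = size; break;
        default: return -1; // unknown whence value
    }
    immutable target = base + offset;
    return target < 0 ? -1 : target;
}

// By contrast, a raw pointer carries no length, so this function has no
// way to verify that `buffer` has room for two chars. It must trust the
// caller.
void writeDigitUnchecked(char* buffer, int digit)
{
    buffer[0] = cast(char)('0' + digit);
    buffer[1] = '\0';
}

void main()
{
    assert(resolveSeek(10, 100, 5, SEEK_CUR) == 15);
    assert(resolveSeek(10, 100, 0, 42) == -1);

    char[2] buf;
    writeDigitUnchecked(buf.ptr, 7);
    assert(buf[0] == '7');
}
```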


Andrei

Jun 28 2010
