digitalmars.D - [GSOC] regular expressions beta is here

reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
     In case I failed to mention it before, I'm working on a project 
codenamed FReD: a ~100%* source-level-compatible overhaul of std.regex 
that uses better implementation techniques and provides modern Unicode 
support and the commonly expected syntax niceties.

     I think it's time for a public beta release, since it _should_ be 
ready for mainstream usage. There are some rough edges and a couple of 
issues that I'm aware of, but they should not come up in realistic use cases.

     In order to avoid unexpected regressions, I'd be glad if current 
std.regex users tried it on their projects/tests.
To get a small no-crap-included beta package, see the download section of 
https://github.com/blackwhale/FReD for .7z archives.
I'll upload newer packages as bugs get exposed and fixed. Alternatively, 
if you are comfortable with git you may just git clone the entire repo. Some 
helpful notes (same as the README) can be found here: 
https://github.com/blackwhale/FReD/wiki/Beta-release

Caveats:
     In order for it to compile, a tiny change to the 2.054 source is needed 
(no need to recompile Phobos! it's only in templates):
patch std.algorithm.cmp according to this diff 
https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631 
<https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633>
and to get CTFE features working add if(!__ctfe) listed in the next diff 
on the same webpage.
(this is already upstream, so if you're using a fork of phobos just pull 
this in)

* some API problems might lead to a breaking change, though it didn't 
happen in this release

-- 
Dmitry Olshansky
Aug 10 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wed, 10 Aug 2011 13:42:25 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 and to get CTFE features working add if(!__ctfe) listed in the next diff

Hi, does this rewrite cover compile-time regex compilation? E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.

-- 
Best regards,
 Vladimir
Aug 10 2011
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 15:16, Vladimir Panteleev wrote:
 On Wed, 10 Aug 2011 13:42:25 +0300, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 and to get CTFE features working add if(!__ctfe) listed in the next diff

Hi, does this rewrite cover compile-time regex compilation? E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.

Yes, I've dubbed it static regex. In fact it will be something similar to this, though it will do a heap allocation for backtracking points on the first call to match. Heap allocations are definitely going away in the final release.

You can pass -version=fred_ct -debug to dmd to see the generated programs.

At the moment it's more a proof of concept than a speed demon, something I might be able to change once the CTFE bugs are worked out. Anyway, when it doesn't crash the compiler, it's pretty fast :)

-- 
Dmitry Olshansky
Aug 10 2011
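
As a rough illustration of the exchange above, here is a minimal sketch that puts a hand-written check next to a compile-time pattern. It assumes present-day Phobos, where std.regex offers ctRegex and matchFirst; the beta discussed in this thread spelled things differently (regex!`^a`), and the helper name startsWithA is made up for the comparison.

```d
// Build and run with: dmd -unittest -main -run example.d
import std.regex : ctRegex, matchFirst;

// Hand-written equivalent of what a compile-time engine could ideally
// reduce the pattern `^a` to (the expression quoted in the question).
bool startsWithA(string s)
{
    return s.length && s[0] == 'a';
}

unittest
{
    // The pattern is parsed and turned into D code during compilation.
    auto re = ctRegex!(`^a`);

    assert(!matchFirst("abc", re).empty);  // matches at the start
    assert(matchFirst("bca", re).empty);   // 'a' is not at the start

    // The hand-written version agrees on both inputs.
    assert(startsWithA("abc"));
    assert(!startsWithA("bca"));
}
```

Whether the generated matcher is as tight as the hand-written one depends on the engine; the sketch only shows that both give the same answers.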
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-08-10 12:42, Dmitry Olshansky wrote:

I have a suggestion: make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

Aren't there a lot of things that should be declared as private in the fred.d module?

-- 
/Jacob Carlborg
Aug 10 2011
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 15:34, Jacob Carlborg wrote:
I have a suggestion, make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

Interesting idea, one problem with it is that I want this:

    auto m = match("bleh", "bleh");
    writeln(m);

to actually print "bleh", not true.
Right now, due to a bug carried over from std.regex (interface thing), writeln(m) will just overflow the stack; m.hit, however, works.
 Aren't there a lot of things that should be declared as private in the 
 fred.d module?

Yes, it's a side effect of me having a lot of debugging tools that need these internals. If only the package protection attribute or something like it was working.... Not to mention that the whole module should work in SafeD with a couple of @trusted here and there.

-- 
Dmitry Olshansky
Aug 10 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem:

    struct Foo
    {
        bool b;
        alias b this;
    }

    auto f = Foo();
    static assert(is(typeof(f) == Foo));

The above assert passes as expected.
 Aren't there a lot of things that should be declared as private in the
 fred.d module?

Yes, it's a side effect of me having a lot of debugging tool that do need these internals. If only package protection attribute of something was working.... Not to mention that the whole module should work in SafeD with a couple of trusted here and there.

Ok, I see.

-- 
/Jacob Carlborg
Aug 10 2011
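
A slightly expanded version of the alias this example may help to see both sides of the argument. This is only a sketch built around the same toy Foo, not around the real RegexMatch: the static type stays Foo, yet the value converts to bool implicitly wherever a bool is wanted, which is the property that makes generic code such as a printing routine free to treat it as a plain bool.

```d
// Build and run with: dmd -unittest -main -run example.d
struct Foo
{
    bool b;
    alias b this;   // Foo may be used wherever a bool is expected
}

unittest
{
    auto f = Foo(true);

    // The declared type is untouched, as in the example above.
    static assert(is(typeof(f) == Foo));

    // But Foo now converts to bool implicitly, in any context.
    static assert(is(Foo : bool));
    bool copy = f;          // reads f.b
    assert(copy);
}
```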
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem: struct Foo { bool b; alias b this; } auto f = Foo(); static assert(is(typeof(f) == Foo)); The above assert passes as expected.

After some experience with alias this I had to conclude that it's a rather blunt tool, and I'd rather stay away from it. Actually I like Steven's opCast suggestion, so that it works in conditionals.
 Aren't there a lot of things that should be declared as private in the
 fred.d module?

Yes, it's a side effect of me having a lot of debugging tool that do need these internals. If only package protection attribute of something was working.... Not to mention that the whole module should work in SafeD with a couple of trusted here and there.

Ok, I see.

-- Dmitry Olshansky
Aug 10 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-08-10 17:55, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem: struct Foo { bool b; alias b this; } auto f = Foo(); static assert(is(typeof(f) == Foo)); The above assert passes as expected.


Hmm, it doesn't print anything, I think it looks like a bug in writeln.
 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.

Oh, I didn't know that it would work implicitly in conditionals. Then I'm happy with opCast :)

-- 
/Jacob Carlborg
Aug 10 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/10/11 10:46 AM, Jacob Carlborg wrote:
 On 2011-08-10 17:55, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem: struct Foo { bool b; alias b this; } auto f = Foo(); static assert(is(typeof(f) == Foo)); The above assert passes as expected.


Hmm, it doesn't print anything, I think it looks like a bug in writeln.
 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.

Oh, I didn't know that it would work implicitly in conditionals. Then I'm happy with opCast :)

That's pretty cool actually because it naturally extends the built-in approach. When you do e.g. if (pointer) that's really equivalent to if (cast(bool) pointer) and so on.

Andrei
Aug 10 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-08-10 19:45, Andrei Alexandrescu wrote:
 That's pretty cool actually because it naturally extends the built-in
 approach. When you do e.g. if (pointer) that's really equivalent to if
 (cast(bool) pointer) and so on.

 Andrei

Cool, I always thought that opCast was for explicit casts, but maybe it's explicit in this case, in some way.

-- 
/Jacob Carlborg
Aug 10 2011
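
The "explicit in some way" intuition can be checked with a small sketch. Flag is a made-up type used only for this illustration: a condition is rewritten by the compiler into an explicit cast(bool), so opCast is found there, while an ordinary initialization still refuses to convert.

```d
// Build and run with: dmd -unittest -main -run example.d
struct Flag
{
    int value;

    // Only reachable through cast(bool), written out or compiler-inserted.
    bool opCast(T : bool)() const { return value != 0; }
}

unittest
{
    auto f = Flag(3);

    // In a condition the compiler inserts the cast for us.
    int taken = 0;
    if (f) ++taken;
    assert(taken == 1);

    // Built-in types behave the same way: if (p) means if (cast(bool) p).
    int* p = null;
    assert(!p);

    // Outside of conditions there is no implicit conversion...
    static assert(!is(Flag : bool));

    // ...so the cast has to be spelled out.
    bool b = cast(bool) f;
    assert(b);
}
```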
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/10/11 9:55 AM, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem: struct Foo { bool b; alias b this; } auto f = Foo(); static assert(is(typeof(f) == Foo)); The above assert passes as expected.

After some experience with alias this I had to conclude that it's rather blunt tool, and I'd rather stay away of it.

If alias this is any more blunt than regular subtyping (inheritance), that would be a bug. Feel free to submit if you find such issues.

Andrei
Aug 10 2011
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 16:54, Steven Schveighoffer wrote:
 On Wed, 10 Aug 2011 07:51:32 -0400, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 On 10.08.2011 15:34, Jacob Carlborg wrote:
I have a suggestion, make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

auto m = match("bleh", "bleh"); writeln(m); to actually print "bleh", not true

Without actually looking at the code, why wouldn't something like this work? struct RegexMatch { ... string toString() {...} opCast(T : bool)() {...} } This isn't an implicit cast, but it will work for conditional statements.

-- Dmitry Olshansky
Aug 10 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Aug 2011 07:51:32 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 10.08.2011 15:34, Jacob Carlborg wrote:
I have a suggestion, make RegexMatch implicitly convertible to bool, indicating if there was a match or not.

auto m = match("bleh", "bleh"); writeln(m); to actually print "bleh", not true

Without actually looking at the code, why wouldn't something like this work?

    struct RegexMatch
    {
        ...
        string toString() {...}
        opCast(T : bool)() {...}
    }

This isn't an implicit cast, but it will work for conditional statements.

-Steve
Aug 10 2011
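
A filled-in version of that outline might look like the sketch below. Match and its hit field are hypothetical stand-ins chosen for the illustration (the thread only tells us that the real type has an m.hit); nothing here is taken from the FReD source.

```d
// Build and run with: dmd -unittest -main -run example.d

// Hypothetical stand-in for RegexMatch; not the real type.
struct Match
{
    string hit;   // the matched slice, empty when nothing matched

    // Text form: the matched text itself, not "true"/"false".
    string toString() const { return hit; }

    // Truth value: used by if, while, ! and friends.
    bool opCast(T : bool)() const { return hit.length != 0; }
}

unittest
{
    auto m = Match("bleh");

    bool matched = false;
    if (m) matched = true;            // goes through opCast!bool
    assert(matched);

    assert(m.toString() == "bleh");   // text form is the matched text
    assert(!Match(""));               // an empty match tests as false
}
```

With this split the two wishes from the thread no longer collide: the text form and the truth value are provided by separate members.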
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Aug 2011 12:46:25 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-08-10 17:55, Dmitry Olshansky wrote:

 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.


alias this has lots of problems, but that doesn't mean its *design* is blunt, just that the implementation of it is not too good.
 Oh, I didn't know that it would work implicitly in conditionals. Then  
 I'm happy with opCast :)

http://www.d-programming-language.org/operatoroverloading.html#Cast

Note that it only works for structs (not sure if that return type is a struct or not...)

-Steve
Aug 10 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wed, 10 Aug 2011 14:44:44 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 Yes, I've dubbed it  static regex. In fact it will be something similar  
 to this, though it will do a heap allocation for backtracking points, on  
 first call to match. Heap allocations are definetly going away in final  
 release.

Awesome stuff. D's codegen abilities have the potential to put regex matching way ahead of any C/C++ libraries that don't JIT or stuff like that.

-- 
Best regards,
 Vladimir
Aug 10 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.

When you write some English text you don't write a single block of text, you organize it into paragraphs, and paragraphs into chapters, chapters into sections, sections into books, etc. Some time ago I understood that paragraphs are very good in source code too. So I suggest you add a blank line here and there inside your functions to separate them into paragraphs. I can't give you a style rule, you will need to create your own style, but often a function that's more than 10 lines long needs one or more blank lines inside (some people say that every time you see one of such paragraphs in a function, especially if it has a comment before it, then you need to perform an "extract method" to improve the code. I believe this is bad advice).

I see no contracts in the code (I mean the ones with assert inside, instead of enforce). I suggest that Walter fix this situation. One idea is to include two versions of the Phobos lib in the zip of the dmd distribution, one with asserts compiled in and one without, and let DMD import from the correct library according to the compilation flags.

Some solution to this problem is getting urgent, because Phobos is growing without the use of one of the nicest features of D (contract programming). Solving this problem is more urgent than having an excellent regex library in Phobos. If people don't use contract programming much, it is because you can't use it in Phobos.

Bye,
bearophile
Aug 10 2011
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 20:02, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.

So I suggest you to add a blank line here and there inside your functions to separate them into paragraphs. I can't give you a style rule, you will need to create your own style, but often a function that's more than 10 lines line long needs one or more blank lines inside (some people say that every time you see one of such paragraphs in a function, especially if it has a comment before it, then you need to perform an "extract method" to improve the code. I believe this is a bad advice).

While I haven't asked for review, I do appreciate comments. I have to say I did no cleanup or otherwise shape up the code, I'm still working on the semantic side of the problems :) Honestly I can't get why you are so nervous about code style anyway, you seem to bring this up way too often. About spaces: personally I dislike eating extra vertical space for "clarity", curly braces on their own line are already way too much.
 I see no contracts in the code (I mean the ones with assert inside, instead of
enforce). I suggest Walter to fix this situation. One idea is to include two
versions of Phobos lib in the zip of the dmd distribution, one with asserts
compiled in and one without, and let DMD import from the correct library
according to the compilation flags.
 Some solution to this problem is getting urgent, because Phobos is growing
without the use of one of the nicest features of D (contract programming).
Solving this problem is more urgent than having an excellent regex library in
Phobos. If people don't use contract programming much, is because you can't use
it in Phobos.

Have to respectfully disagree on this, don't try to nail everything on contracts. They are nice but have little value over plain assert _unless_ we are talking about classes and _inheritance_, which isn't the case here. And there are lots of asserts here, but much more of the input is enforced, since it's totally expected that someone supplies a wrong pattern (or has an outside user type in the pattern).
 Bye,
 bearophile

-- Dmitry Olshansky
Aug 10 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 Honestly I can't get why you are so nervous about code style anyway, you 
 seem to bring this up way to often.

I bring it often because many D programmers seem half blind to this problem. I am not willing to go to the extremes Go language goes to solve this problem, but I'd like more recognition of this problem in D programmers. A bit more common style is quite helpful to create an ecology of D programmers that share single modules. I guess D programmers are used to C/C++ languages, where there are not modules and where programs are usually made of many files. So they don't see why sharing single modules in the pool is so useful.
 About spaces personally I dislike eating extra vertical space for 
 "clarity", curly braces on it's own line is already way too much.

Think about reading a book without the half lines between paragraphs. In code it's the same. Some empty lines are good to improve readability of the code. Curly braces are not always present, sometimes a paragraphs ends before or after or right on a curly brace.
 Have to respectfully disagree on this, don't try to nail everything on 
 contracts.

Contracts don't replace unittests, they complement each other.
 They are nice but have little value over plain assert 
 _unless_ we are talking about classes and _inheritance_, which isn't the 
 case here.

It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.
 And there are lots of asserts here, but much more of input is 
 enforced since it's totally expected to supply wrong pattern (or have an 
 outside  user to type in the pattern).

The idea is to replace those enforces with asserts, and allow user programs to import Phobos stuff that still contains asserts (from a secondary Phobos lib). Enforces are for certain kinds of user code, I don't think they fit in Phobos.

Bye,
bearophile
Aug 10 2011
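
For readers who have not used them, the two features named in this post can be sketched on a toy struct. Counter and its members are invented for the illustration and have nothing to do with FReD: the out contract checks the result in one place, and the invariant runs automatically around every public method call, so no manual sanity call is needed.

```d
// Build and run with: dmd -unittest -main -run example.d

// Toy type, made up to show contract syntax on a struct.
struct Counter
{
    private int count;

    // Checked on entry to and exit from every public member function.
    invariant()
    {
        assert(count >= 0, "count must never go negative");
    }

    int remaining() const { return count; }

    // Removes n items and returns how many were removed.
    int take(int n)
    in
    {
        assert(n > 0 && n <= count, "caller asked for too much");
    }
    out (result)
    {
        assert(result == n, "must hand back exactly what was asked for");
    }
    do
    {
        count -= n;
        return n;
    }
}

unittest
{
    auto c = Counter(5);
    assert(c.take(2) == 2);
    assert(c.remaining == 3);
}
```

All of these checks disappear when the code is compiled with -release, which is the reason the post asks for a second, assert-enabled build of Phobos.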
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
bearophile:

The thing is just because you call it a problem a lot doesn't mean
everyone else sees it that way.

A lot of us have many years of experience and just don't see it the
same way you do.
Aug 10 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Adam D. Ruppe:

 A lot of us have many years of experience and just don't see it the
 same way you do.

This "you" is a group that includes people like Guido V. Rossum, Rob Pike, Ken Thompson and R. Hettinger (they have feelings even stronger than mine on this topic).

Bye,
bearophile
Aug 10 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 10, 2011 21:42:01 Marco Leise wrote:
 Am 10.08.2011, 19:24 Uhr, schrieb Adam D. Ruppe
 
 <destructionator gmail.com>:
 bearophile:
 
 The thing is just because you call it a problem a lot doesn't mean
 everyone else sees it that way.
 
 A lot of us have many years of experience and just don't see it the
 same way you do.

I think a blank line makes code easier on the eyes. When you scroll over it you recognize easily where you are from the size and shape of the paragraphs. So I totally understand that. On the other hand my laptop screen is 1280x800 and I also feel that sometimes I think I scroll over the end of a function body when there is just a blank line in a block of code. So usually I go with the approach of inserting a comment line instead of a blank line, which is usually italic and in a brighter color. If I was working on a Phobos module I would try to mime existing code style (and probably find out that there is no common style :p ). Anyway such things can be up to a vote just like the idea to not use single capital letters only for template type placeholders (i.e. T, S). Google's code style wiki is nice. It lists all the rules and also offers an explanation. We can have that for Phobos, too. So topics like these don't come up over and over again. The D style guide is a good start: http://www.digitalmars.com/d/2.0/dstyle.html

This sort of thing has been discussed by the Phobos dev team previously, and the general consensus was not to enforce much in the way of formatting in a style guide. There are a few things that were agreed upon (such as always putting braces on their own line), but on the whole, the style guide is supposed to focus on the API (so, things like function and variable names) rather than how code is formatted.

I have an update to the style guide as a pull request which is currently being reviewed to make sure that the style guide on the site is in line with what we do:

https://github.com/D-Programming-Language/d-programming-language.org/pull/16

But I'm certain that you're not going to get the Phobos devs to agree on a style guide like Bearophile wants. And honestly, I'm a bit tired of the topic coming up. The style guide does need some updates, but it's mostly correct. It's essentially what we've decided on, and I don't see any reason to keep discussing it over and over.

Personally, I'd prefer that Dmitry had more blank lines in his code, but it's up to him how he does that as long as his code falls within the rules set down by the D style guide. And for any of his code which isn't going into Phobos, it's completely up to him how to format it.

- Jonathan M Davis
Aug 10 2011
parent reply simendsjo <simendsjo gmail.com> writes:
On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json uses braces on the same line.
Aug 10 2011
parent reply simendsjo <simendsjo gmail.com> writes:
On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json uses braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them. - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)
Aug 11 2011
parent simendsjo <simendsjo gmail.com> writes:
On 11.08.2011 11:04, Jonathan M Davis wrote:
 On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json uses braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them. - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)

You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line. - Jonathan M Davis

I actually like that a language has a "default" style. Java, C# and Python all have a default style that makes code easy to read regardless of who wrote it (of course, Python has some enforced stuff with indentation). You can, for instance, break the style as much as you'd like in C#, but I've yet to see a library that uses a very different style. But then again.. Unless it's written in an obfuscated style, it doesn't really matter that much..
Aug 11 2011
prev sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 10.08.2011, 22:12 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 [...] I don't see any reason to keep discussing it over and over.

You see, and that is why we should make that explicit rather than implicit in the style guide. An additional point "personal preference" could list "blank lines to group logical blocks of code".
Aug 10 2011
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 21:11, bearophile wrote:
 Dmitry Olshansky:

 Honestly I can't get why you are so nervous about code style anyway, you
 seem to bring this up way to often.

 About spaces personally I dislike eating extra vertical space for
 "clarity", curly braces on it's own line is already way too much.


Braces *are* paragraphs of code; with proper indentation that's more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single line of comment instead of a meaningless empty line (which leaves the reader clueless as to why) is good enough. That said, I'm not opposed to blank lines at global scope.
 Have to respectfully disagree on this, don't try to nail everything on
 contracts.


 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

 And there are lots of asserts here, but much more of input is
 enforced since it's totally expected to supply wrong pattern (or have an
 outside  user to type in the pattern).


etc. You can't assert that something external won't fail, while you'd normally assert on your local logical invariants. As for other things, I thought e.g. ranges are already hooked on asserts, as much as other templates. If you have a list of modules where you find the lack of compiled-in contracts/asserts unbearable, do tell. I hate being dragged into these discussions, but just can't resist. -- Dmitry Olshansky
Aug 10 2011
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 22:11, Vladimir Panteleev wrote:
 On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 About spaces personally I dislike eating extra vertical space for
 "clarity", curly braces on their own line is already way too much.

paragraphs. In code it's the same. Some empty lines are good to improve readability of the code. Curly braces are not always present, sometimes a paragraph ends before or after or right on a curly brace.

Braces *are* paragraphs of code; with proper indentation it's more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single line of comment instead of a meaningless empty line (which leaves the reader clueless as to why) is good enough. That said, I'm not opposed to spacing at global scope.

I agree with bearophile; I find that leaving a blank line between groups of closely related lines makes the code much more readable. I don't understand what's with the craving for maximum vertical terseness either, but that may be because the resolution of my primary monitor is currently 1200x1920 :)

this league of abundant vertical space :) -- Dmitry Olshansky
Aug 10 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 Braces *are* paragraphs of code,

They sometimes are, but inside functions there are other kinds of "paragraphs". As an example, this is first-quality C code (partially written by R. Hettinger):
http://hg.python.org/cpython/file/d5b274a0b0a5/Modules/_collectionsmodule.c

If you take a random function from that page, like:

static int
deque_del_item(dequeobject *deque, Py_ssize_t i)
{
    PyObject *item;

    assert (i >= 0 && i < deque->len);
    if (_deque_rotate(deque, -i) == -1)
        return -1;

    item = deque_popleft(deque, NULL);
    assert (item != NULL);
    Py_DECREF(item);

    return _deque_rotate(deque, i);
}

You see a blank line after "Py_DECREF(item);" even though there is no closing brace. The purpose of those blank lines is to help the person who reads the code tell apart the various things done by that function. This C code is well written.
 Not gonna work, file I/O is certainly in Phobos, as are network sockets, 
 etc. You can't assert that something external won't fail.

OK.
 I hate being dragged into these discussions, but just can't resist.

I am sorry, but thank you for answering :-) Bye, bearophile
Aug 10 2011
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Contracts don't replace unittests, they complement each other.
 
 
 They are nice but have little value over plain assert 
 _unless_ we are talking about classes and _inheritance_, which isn't the 
 case here.

It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes. "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.
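
For readers who have not used struct invariants, here is a minimal sketch of the kind of check being argued for. It is only an illustration: the `Fraction` type and its members are hypothetical and not Phobos code, and `do` is the spelling current compilers use where the code in this thread says `body`.

```d
// Hypothetical example type; not part of Phobos.
struct Fraction
{
    private int num;
    private int den = 1;   // so that Fraction.init also satisfies the invariant

    // Checked after the constructor and around every public method call,
    // in builds that keep contracts (i.e. without -release).
    invariant()
    {
        assert(den > 0, "denominator must stay positive");
    }

    this(int n, int d)
    in
    {
        assert(d != 0, "zero denominator");
    }
    do
    {
        if (d < 0) { n = -n; d = -d; }
        num = n;
        den = d;
    }

    int numerator() const { return num; }
    int denominator() const { return den; }

    void invert()
    in
    {
        assert(num != 0, "cannot invert zero");
    }
    do
    {
        immutable old = num;
        num = den;
        den = old;
        if (den < 0) { num = -num; den = -den; }   // restore the invariant
    }
}

unittest
{
    auto f = Fraction(3, -4);
    assert(f.numerator == -3 && f.denominator == 4);
    f.invert();
    assert(f.numerator == -4 && f.denominator == 3);
}
```

If the normalisation line in `invert` were forgotten, the invariant would fail on exit from the method without anyone having to remember to call a sanity check by hand.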
Aug 10 2011
next sibling parent reply Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
Don wrote:

 bearophile wrote:
 Contracts don't replace unittests, they complement each other.
 
 
 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes. "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.

What about out contracts on interfaces in a library (where you use the library by implementing them)?
Aug 10 2011
parent Don <nospam nospam.com> writes:
Lutger Blijdestijn wrote:
 Don wrote:
 
 bearophile wrote:
 Contracts don't replace unittests, they complement each other.


 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

helpful for structs as well as classes. "out" contracts seem to be almost useless, unless you have a theorem prover. The reason is, that they test nothing apart from the function they are attached to, and it's much better to do that with unittesting. They have very little in common with 'in' contracts. I think that EVERY struct and class in Phobos should have an invariant (except for something like Complex, where there are no invalid values). But I don't think 'out' contracts would add much value at all.

What about out contracts on interfaces in a library (where you use the library by implementing them)?

That involves inheritance. But I don't think there are any cases in Phobos where that is currently applicable.
Aug 10 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Don:

"out" contracts seem to be almost useless, unless you have a theorem prover.
The reason is, that they test nothing apart from the function they are attached
to, and it's much better to do that with unittesting.<

I see three different situations where postconditions are useful in D:

1) Sometimes the result of your function/method must satisfy some simple condition to be correct. As an example, it must be a nonnegative number. Then you add assert(result >= 0, "..."); in the out. For a Phobos example, the std.algorithm.countUntil postcondition is allowed to test assert(result >= -1, "..."); Other possible conditions are that the output string can't be longer than a certain amount (like longer than the input string), and so on. In certain cases the program that finds the solution is slow, but testing the correctness of the result is fast. I have hit many situations like this. As an example, you test that the result of a complex sorting algorithm is ordered and has the same length as the input (but maybe you don't test that the output items are the same as the input items).

2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm. The little program may be too slow for normal usage, but it's just a few lines long (especially if I use a lot of std.algorithm stuff) and it's much less likely to contain bugs. You can't always verify the result of the fast algorithm with the slow algorithm, because that would not be useful. In such situations I write the postcondition like this:

in {
    // ...
}
out(result) {
    // some fast postcondition tests here

    debug {
        assert(result == slowAlgorithm(input));
    }
}
body {
    // fast algorithm here
}

This way, in release mode it tests nothing, in a nonrelease build it tests the fast postconditions, and in debug mode it also verifies that the fast algorithm gives the same results as the slow algorithm. Generally, solving a problem in two quite different ways helps catch problems in the algorithms.

3) When D gets the prestate ("old" in some contract programming implementations), I will be able to use the prestate inside the postcondition to verify better that the function/method has changed the globals, or instance attributes, in a correct way. You can't put such tests in the class/struct invariant, or in the precondition.

I'm using postconditions often in my code (less often than preconditions, but often enough). A theorem prover is not strictly necessary for them to be useful.

Bye,
bearophile
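
The debug-only cross-check described in point 2 can be sketched as follows. This is a minimal illustration and not code from the post: `sumOfSquares` and `slowSumOfSquares` are hypothetical names, the input range limit is an assumption chosen to avoid overflow, and `do` is the keyword current compilers use where this thread writes `body`.

```d
import std.algorithm : map, sum;
import std.range : iota;

// Hypothetical slow reference: short and easy to audit.
long slowSumOfSquares(int n)
{
    return iota(1, n + 1).map!(k => cast(long) k * k).sum;
}

// Hypothetical fast version using the closed form n(n+1)(2n+1)/6.
long sumOfSquares(int n)
in
{
    assert(n >= 0 && n <= 1_000_000, "n outside the supported range");
}
out (result)
{
    // Cheap, loose check: active in any build without -release.
    assert(result >= 0);
    // Expensive, tight cross-check: compiled in only with -debug.
    debug assert(result == slowSumOfSquares(n));
}
do
{
    return cast(long) n * (n + 1) * (2L * n + 1) / 6;
}

unittest
{
    assert(sumOfSquares(0) == 0);
    assert(sumOfSquares(3) == 14);
    assert(sumOfSquares(10) == 385);
}
```

Building with `-debug` runs the slow reference on every call, so inputs that no unit test anticipated still get compared against it.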
Aug 11 2011
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Don:
 
 "out" contracts seem to be almost useless, unless you have a theorem prover.
The reason is, that they test nothing apart from the function they are attached
to, and it's much better to do that with unittesting.<

I see three different situations where postconditions are useful in D:

1) Sometimes the result of your function/method must satisfy some simple condition to be correct. As an example, it must be a nonnegative number. Then you add assert(result >= 0, "..."); in the out. For a Phobos example, the std.algorithm.countUntil postcondition is allowed to test assert(result >= -1, "..."); Other possible conditions are that the output string can't be longer than a certain amount (like longer than the input string), and so on. In certain cases the program that finds the solution is slow, but testing the correctness of the result is fast. I have hit many situations like this. As an example, you test that the result of a complex sorting algorithm is ordered and has the same length as the input (but maybe you don't test that the output items are the same as the input items).

2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm. The little program may be too slow for normal usage, but it's just a few lines long (especially if I use a lot of std.algorithm stuff) and it's much less likely to contain bugs.

Sorry, but personally I don't believe that this is useful outside of toy examples. The question is, what bugs does it find that aren't found by a trivial unit test?
 You can't always verify the result of the fast algorithm with the slow
algorithm, this is not useful.
 In such situations I write the postcondition like this:
 
 in {
     // ...
 }
 out(result) {
     // some fast postcondition tests here

     debug {
         assert(result == slowAlgorithm(input));
     }
 }
 body {
     // fast algorithm here
 }
 
 
 This way, in release mode it tests nothing, in nonrelease build it tests the
fast postconditions, and in debug mode it also verifies the fast algorithm
gives the same results as the slow algorithm. Generally solving a problem in
two quite different ways helps catch problems in the algorithms.
 
 
 3) When D gets the prestate ("old" in some contract programming
implementations), I will be able to use the prestate inside the postcondition
to verify better that the function/method has changed the globals, or instance
attributes, in a correct way. You can't put such tests in the class/struct
invariant, or in the precondition.

There are two cases:
(1) it's a very tight test. In which case, it's essentially a unit test.
or (2) it's a very loose test. In which case, it doesn't find bugs.
 I'm using postconditions often in my code (less often than preconditions, but
often enough). A theorem prover is not strictly necessary for them to be useful.

I would like to see an example of a good postcondition. The crucial feature is, they do NOTHING except find bugs in the function they are attached to. So it's very difficult to invent a plausible one.

For starters, it really needs to be a function with multiple return values. Otherwise, you can just stick asserts just before your return statement, and you don't need __old or any such thing.

Under what circumstances are they more valuable than any other assert inside a function?
Aug 11 2011
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
If it's worth anything, I use the out contracts in dom.d more as
checked documentation than for serious bug-finding.

For example:

Element appendChild(Element newChild)
out (ret) { assert(ret is newChild); }
body { ... }

I also use it from time to time to assert that a return value is not
null. The check itself isn't particularly useful, but I think it's
a nice bit of documentation.

Actually, IMO, in and out contracts should be in the generated
ddoc too.
Aug 11 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Don:

 2) I have found many situations where I am able to solve a problem with both a
simple and slow brute force solver, and a complex and fast algorithm to solve a
problem. The little program maybe is too much slow for normal usage, but it's
just few lines long (especially if I use lot of std.algorithm stuff) but it's
much less likely to contain bugs.


 Sorry, but personally I don't believe that this is useful outside of toy
examples.
 The question is, what bugs does it find that aren't found by a trivial unit
test?

 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the post-condition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages:
- It's tight, because the second algorithm is supposed to always give the same results as the function.
- It works with the real examples the program is run on, not just the cases you have put in the unit test.

Sometimes you forget to add certain cases in the unittests. Putting the test in the postcondition makes sure it always runs, for all the inputs your function is run on (unless you disable it), so you will catch the cases you didn't think of in the unittests.
 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.

In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a major sub-feature of DbC. But I don't agree that postconditions are useless in D.
 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all the return paths are verified. If the function has only one return value it helps anyway, because it helps you not forget to verify the result.
 Under what circumstances are they more valuable than any other assert
inside a function?

I have already given some answers. Another answer is this:

int foo(int x)
in {
    // ...
}
out(result) {
    auto y = computeSomething(result);
    assert(y ...);
    assert(y ...);
}
body {
    // ...
}

The out{} helps you organize your code, separating the tests of the body from the postcondition tests. Also, in the postcondition you are allowed to define new variables and call things. All this out(){} code vanishes in release mode. How do you do that with just asserts inside the body?

If you do this, the asserts will vanish in release mode, but y will still be computed, wasting computations (a smart compiler is able to see that y is not used, etc., but it's not certain this optimization happens if the computation of y is complex and it's done in-place):

int foo(int x)
in {
    // ...
}
body {
    result = ...;
    auto y = computeSomething(result);
    assert(y ...);
    assert(y ...);
    return result;
}

I presume there are ways to disable the computation of y in release mode, but I don't want to think about them. I just stick the y computation in the postcondition and the compiler will take care of it.

Bye,
bearophile
Aug 11 2011
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Don:
 
 2) I have found many situations where I am able to solve a problem with both a
simple and slow brute force solver, and a complex and fast algorithm to solve a
problem. The little program maybe is too much slow for normal usage, but it's
just few lines long (especially if I use lot of std.algorithm stuff) but it's
much less likely to contain bugs.


 Sorry, but personally I don't believe that this is useful outside of toy
examples.
 The question is, what bugs does it find that aren't found by a trivial unit
test?

 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the post-condition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages: - It's tight, because the second algorithm is supposed to always give the same results as the function.

 - It works with the real examples the program is run too, not just the cases
you have put in the unit test.

Conditions required for this to be true:
(1) the function must not be time critical;
(2) an alternative algorithm must exist;
(3) the alternative algorithm must be bug-free;
(4) the function must not have been tested properly;
(5) the faulty test cases must occur during debugging (they won't be caught during production);
(6) the programmer must remember to put the asserts in the 'out' contract, but not put them into the body of the function.

This doesn't leave much.
 Sometimes you forget to add certain cases in the unittests. Putting the test in the postcondition makes sure it always runs, for all the inputs your function is run on (unless you disable it), so you will catch the cases you didn't think of in the unittests.
 
 
 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.

In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a major sub-feature of DbC. But I don't agree that postconditions are useless in D.

??? Does that relate to my sentence in any way?
 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all the return paths are verified.

That's what I said.
 If the function has only one return value it helps anyway, because it helps
you not forget to verify the result.

???? Why would you remember to put an assert in the postcondition, when you didn't put it into the function?
 
 
 Under what circumstances are they more valuable than any other assert
inside a function?

I have already given some answers.

No you haven't.
 Another answer is this:
 
 
 int foo(int x)
 in {
     // ...
 }
 out(result) {
     auto y = computeSomething(result);
     assert(y ...);
     assert(y ...);
 }
 body {
     // ...
 }
 
 The out{} helps you organize your code, separating the tests of the body from
the postcondition tests. Also in the postcondition you are allowed to define
new variables and call things. All this out(){} code vanishes in release mode.
How do you do that with just asserts inside the body?
 
 
 If you do this the asserts will vanish in release mode, but the y will be
computed still, wasting computations (a smart compiler is able to see y is not
used and etc, but it's not sure this optimization happens if the computation of
y is complex and it's done in-place):
 
 int foo(int x)
 in {
     // ...
 }
 body {
     result = ...;
     auto y = computeSomething(result);
     assert(y ...);
     assert(y ...);
     return result;
 }
 
 
 I presume there are ways to disable the computation of y in release mode, but
I don't want to think about them. I just stick the y computation in the
postcondition and the compiler will take care of it.

Trivial! Make the postcondition a nested function. (You can even make it a delegate literal, if it's only used in one place).

I'll explain my original statement further: If you have a theorem prover, then the theorem prover can use the 'out' contract in any function which calls that function.

Eg,
int square(int x) out(result) { assert(result>=0); } body { return x*x; }

void foo()
{
    int q = square(-5);
    if (q < 0) { .... }
}

Theorem prover knows that q>=0, even if it doesn't have access to the body of 'square'. So it detects unreachable code in foo().

So in this case, the 'out' contract can be used to find bugs in code that the author of the contract didn't write. Otherwise, out contracts only find bugs in the local function, which doesn't have much value, since unit testing already performs that role (and does it better).

By contrast, 'in' functions ALWAYS find external bugs rather than local ones, so they're an order of magnitude more valuable in the current implementation.
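
The nested-function approach suggested at the top of this post could look roughly like the sketch below. It is an assumption-laden illustration: `computeSomething` is a hypothetical stand-in for whatever derived value the checks need, the range limit in the precondition is only there to keep `x * x` from overflowing, and `do` is the keyword current compilers use where this thread writes `body`.

```d
// Hypothetical stand-in for an expensive derived quantity.
int computeSomething(int r)
{
    return r % 7;
}

int foo(int x)
in
{
    assert(x > -46_341 && x < 46_341, "x * x would overflow int");
}
do
{
    // The postcondition lives in a nested function that is called only from
    // inside assert, so with -release the call goes away with the assert.
    bool postcondition(int result)
    {
        immutable y = computeSomething(result);
        return y >= 0 && y < 7;
    }

    immutable result = x * x;
    assert(postcondition(result));
    return result;
}

unittest
{
    assert(foo(3) == 9);
    assert(foo(-10) == 100);
}
```

With a single return statement this behaves much like an `out` block; with several return paths the assert has to be repeated before each one.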
Aug 12 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 08/12/2011 01:31 PM, Don wrote:
 bearophile wrote:
 Don:

 2) I have found many situations where I am able to solve a problem
 with both a simple and slow brute force solver, and a complex and
 fast algorithm to solve a problem. The little program maybe is too
 much slow for normal usage, but it's just few lines long (especially
 if I use lot of std.algorithm stuff) but it's much less likely to
 contain bugs.


 Sorry, but personally I don't believe that this is useful outside of
 toy examples.
 The question is, what bugs does it find that aren't found by a
 trivial unit test?

 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the post-condition implements a third possibility you are missing. Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere). If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages: - It's tight, because the second algorithm is supposed to always give the same results as the function.

 - It works with the real examples the program is run too, not just the
 cases you have put in the unit test.

Conditions required for this to be true: (1) the function must not be time critical;

If the difference is not an asymptotic one, it can well be time critical (then the debug version will just not be as responsive as would be desirable for a finished product, which is often the case anyways.)
 (2) an alternative algorithm must exist;

If an optimized version exists, a slower one exists too.
 (3) the alternative algorithm must be bug-free;

That is often trivial. Also, if it is buggy, the discrepancy will be caught by the contract and the bug can be fixed.
 (4) the function must not have been tested properly;

Usually, large software that has been 'tested properly' still contains bugs. For mission critical tasks, a form of testing related to this one is used heavily (multiple teams implement the same specification and the result of each query to the software is determined by majority vote).
 (5) the faulty test cases must occur during debugging (they won't be
 caught during production);

Sure. This can catch e.g. regressions during development. If there is a large team of programmers involved, contracts are more useful than if there is only a single developer.
 (6) the programmer must remember to put the asserts in the 'out'
 contract, but not put them into the body of the function.

Well, if they are a seasoned contract programmer, this is not a problem at all. ;)
 This doesn't leave much.

I disagree.
 Sometimes you forget to add certain cases in the unittests. Putting the
 test in the postcondition makes sure it always run, for all the inputs
 your function is run on (unless you disable it), so you will catch the
 cases you didn't think of in the unittests.

 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.



They specify what the function is supposed to do, in a way that always is up to date because it gets checked.
 In Eiffel you have the prestate too (the old), so the postcondition is
 the only place where such information is usable. I hope prestate will
 be added to D DbC, because it's a major sub-feature of DbC. But I
 don't agree that postconditions are useless in D.

??? Does that relate to my sentence in any way?

Yes. He says that once prestate is available, out contracts will be more useful. But he thinks they are already quite valuable without them.
 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all the return paths are verified.

That's what I said.
 If the function has only one return value it helps anyway, because it
 helps you not forget to verify the result.

???? Why would you remember to put an assert in the postcondition, when you didn't put it into the function?

Don wrote
 bearophile wrote:
 If a function has multiple return values the out(result) helps make
 sure all the return paths are verified.

That's what I said.

Exactly that reason:

int foo(){
    // some code
    if(condition) return 37; // added after 2h of debugging
    // more code
    result=...;
    assert(condition(result));
    return result;
}

int foo()
out(result){assert(condition(result));}
body{
    //some code
    if(condition) return 37;
    // more code
    return ...;
}

It is both more convenient (you don't have to change your program logic) and less error-prone. Furthermore, all other programmers on the project can immediately check the postcondition and rely on the fact that it holds for the result of any call of foo, even if the compiler does not use the out contract for any theorem proving. They can even do that before the respective function is implemented correctly. Out contracts are particularly useful when they are written before the function has been implemented completely.
 Under what circumstances are they more valuable than any other
 assert inside a function?

I have already given some answers.

No you haven't.
 Another answer is this:


 int foo(int x)
 in {
 // ...
 }
 out(result) {
 auto y = computeSomething(result);
 assert(y ...);
 assert(y ...);
 }
 body {
 // ...
 }

 The out{} helps you organize your code, separating the tests of the
 body from the postcondition tests. Also in the postcondition you are
 allowed to define new variables and call things. All this out(){} code
 vanishes in release mode. How do you do that with just asserts inside
 the body?


 If you do this the asserts will vanish in release mode, but the y will
 be computed still, wasting computations (a smart compiler is able to
 see y is not used and etc, but it's not sure this optimization happens
 if the computation of y is complex and it's done in-place):

 int foo(int x)
 in {
 // ...
 }
 body {
 result = ...;
 auto y = computeSomething(result);
 assert(y ...);
 assert(y ...);
 return result;
 }


 I presume there are ways to disable the computation of y in release
 mode, but I don't want to think about them. I just stick the y
 computation in the postcondition and the compiler will take care of it.

Trivial! Make the postcondition a nested function. (You can even make it a delegate literal, if it's only used in one place).

Because everyone who is working on the project wants to check nested functions? Sure it works, but it is not the best way to implement contract programming. That is why D has language support that goes beyond that.
 I'll explain my original statement further: If you have a theorem
 prover, then the theorem prover can use the 'out' contract in any
 function which calls that function.

 Eg,
 int square(int x) out(result) { assert(result>=0); } body { return x*x; }

 void foo()
 {
 int q = square(-5);
 if (q < 0) { .... }

 }
 Theorem prover knows that q>=0, even if it doesn't have access to the
 body of 'square'. So it detects unreachable code in foo().

Theorem prover detects bug in square.
 So in this case, the 'out' contract can be used to find bugs in code
 that the author of the contract didn't write.
 Otherwise, out contracts only find bugs in the local function, which
 doesn't have much value, since unit testing already performs that role
 (and does it better).

In this case, obviously all the unit tests tested square with an input that was less than 2^^16 in absolute value. Writing the postcondition sometimes also allows you to reflect properly on what the precondition should be. Also, it will be tested on possibly unexpected input.
 By contrast, 'in' functions ALWAYS find external bugs rather than local
 ones, so they're an order of magnitude more valuable in the current
 implementation.

Not always. Sometimes they find bugs in the specification or the in contract itself.

Contracts are not only an instrument of verification, but also one of specification.
http://en.wikipedia.org/wiki/Design_by_contract

The out contract is not there to verify some internal consistency conditions, but to specify what the function should compute, in an exact way, that is always up to date. The out contract is for programmers too, not only for compilers. Contract programming is one of these Software Engineering things. :)

The crucial difference between out contract and an assert at the end of the function is how they are supposed to be used, not how they will work. This is reflected by the fact that DMD *.di generation will keep the contracts around.

-Timon
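
The point about `square` can be tried out directly. The sketch below is an illustration, not code from the thread: it relies on D's wrap-around semantics for `int` arithmetic, uses `std.exception.assertThrown` to observe the failing contract, and writes `do` where this thread's code says `body`. The contract only fires in builds that keep contracts (no `-release`).

```d
import core.exception : AssertError;
import std.exception : assertThrown;

int square(int x)
out (result)
{
    assert(result >= 0);
}
do
{
    return x * x;   // wraps around once |x| exceeds 46_340
}

unittest
{
    // Inputs a typical unit test would try: the contract holds.
    assert(square(-5) == 25);
    assert(square(46_340) == 2_147_395_600);

    // 46_341 * 46_341 exceeds int.max and wraps to a negative value,
    // so the out contract reports the bug in square itself.
    assertThrown!AssertError(square(46_341));
}
```

A unit test that only used small inputs would pass, while the postcondition is evaluated for whatever arguments callers actually supply.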
Aug 12 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Don:

 bearophile wrote:
 2) I have found many situations where I am able to solve a problem with both a
 simple and slow brute force solver, and a complex and fast algorithm to solve
 a problem. The little program maybe is too much slow for normal usage, but
 it's just few lines long (especially if I use lot of std.algorithm stuff)
 but it's much less likely to contain bugs.

Sorry, but personally I don't believe that this is useful outside of toy examples.

This code of mine is a real-world example. This is a struct method with comments removed; the postcondition contains both fast loose tests and a tight slow O(n^2) version that thanks to std.algorithm is just 2 lines long (unfortunately because of DMD bug 6417 it's a bit longer than 2 lines). It's asymptotically slower than the fast algorithm, so I've put it into a debug{}.

void foo(in int[] p, int[] q) nothrow
in {
    assert(p.length == vectorLen);
    assert(q.length == vectorLen);
    assert(equal(p.dup.sort(), iota(1, vectorLen+1)));
} out {
    foreach (i, qi; q)
        assert(qi >= 0 && qi < (vectorLen - i));

    debug
        foreach (j; 1 .. (q.length + 1))
            assert(q[j-1] == count!((int k){ return p[k] > j; })(iota(countUntil(cast()p, j) + 1)));
} body {
    op[0] = &items[0];
    foreach (i, pi; p) {
        items[i] = Item(pi, 0);
        op[i + 1] = &items[i + 1];
    }

    foreach_reverse (k; 0 .. (lim + 1)) {
        xs[0 .. ((vectorLen >> (k + 1)) + 1)] = 0;
        foreach (j; 0 .. vectorLen) {
            int r = (op[j].space >> k) % 2;
            int s = op[j].space >> (k + 1);
            if (r)
                xs[s]++;
            else
                op[j].digit += xs[s];
        }
    }

    foreach (i; 0 .. vectorLen)
        q[op[i].space - 1] = op[i].digit;
}

This postcondition has caught a simple mistake I've put in the fast algorithm. Probably there are ways to catch the same bug with unittests too.

The ugly empty cast() inside the postcondition is another workaround, because countUntil doesn't work with a const p.

If you write those two postcondition lines in Python3 it becomes less noisy:

assert q == [sum(p[k] > j for k in range(p.index(j) + 1)) for j in range(1, len(q)+1)]

Instead of:

foreach (j; 1 .. q.length+1)
    assert(q[j-1] == count!((int k){ return p[k] > j; })(iota(countUntil(cast()p, j) + 1)));

Here using assert(equal(q, map!...)) becomes too much puzzle-code. It's already too nested. If you program in functional style it's hard to write lines of 70 chars. In Haskell too, lines of code are often long.

Bye,
bearophile
Aug 12 2011
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 11.08.2011 8:58, Don wrote:
 bearophile wrote:
 Contracts don't replace unittests, they complement each other.


 They are nice but have little value over plain assert _unless_ we 
 are talking about classes and _inheritance_, which isn't the case here.

It's easy to forget to test the output of a function, the "out" contracts help here. In structs the invariant helps you avoid forgetting to call manually a sanity test function every time you come in and out of a method.

You're conflating a couple of things here. Invariants are tremendously helpful for structs as well as classes.

I stand corrected about invariants, somehow I wasn't considering them a part of contracts.
 "out" contracts seem to be almost useless, unless you have a theorem 
 prover. The reason is, that they test nothing apart from the function 
 they are attached to, and it's much better to do that with unittesting.
 They have very little in common with 'in' contracts.

 I think that EVERY struct and class in Phobos should have an invariant 
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.
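
As a rough sketch of what an invariant on a struct looks like: `Fraction` below is a hypothetical type chosen because, unlike Complex, it does have invalid values (a zero denominator). In a non-release build the invariant is checked at the end of the constructor and on entry to and exit from public member functions.

```d
// Hypothetical type for illustration only.
struct Fraction
{
    private int num;
    private int den = 1;

    // Checked automatically around public member functions.
    invariant()
    {
        assert(den != 0, "denominator must be non-zero");
    }

    this(int n, int d)
    {
        num = n;
        den = d;
    }

    // Swaps numerator and denominator; the invariant catches 0/x inputs.
    void invert()
    {
        import std.algorithm.mutation : swap;
        swap(num, den);
    }

    double value() const
    {
        return cast(double) num / den;
    }
}

void main()
{
    auto f = Fraction(1, 2);
    f.invert(); // now 2/1, invariant holds on entry and exit
    assert(f.value == 2.0);
    // Fraction(0, 1).invert() would trip the invariant in a non-release build.
}
```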

-- Dmitry Olshansky
Aug 11 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, August 11, 2011 06:58:51 Don wrote:
 I think that EVERY struct and class in Phobos should have an invariant
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.

That would be great, but several bugs need to be fixed before that's possible, including http://d.puremagic.com/issues/show_bug.cgi?id=1251 http://d.puremagic.com/issues/show_bug.cgi?id=5039 http://d.puremagic.com/issues/show_bug.cgi?id=5058 http://d.puremagic.com/issues/show_bug.cgi?id=5500 - Jonathan M Davis
Aug 10 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-08-10 18:02, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.

When you write some English text you don't write a single block of text; you organize it into paragraphs, paragraphs into sections, sections into chapters, chapters into books, etc. Some time ago I realized that paragraphs are very good in source code too. So I suggest you add a blank line here and there inside your functions to separate them into paragraphs. I can't give you a style rule, you will need to create your own style, but often a function that's more than 10 lines long needs one or more blank lines inside (some people say that every time you see one of such paragraphs in a function, especially if it has a comment before it, you need to perform an "extract method" to improve the code. I believe this is bad advice).

I always add a blank line before and after statements. -- /Jacob Carlborg
Aug 10 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 About spaces: personally I dislike eating extra vertical space for
 "clarity"; curly braces on their own line are already way too much.

In code it's the same. Some empty lines are good to improve readability of the code. Curly braces are not always present; sometimes a paragraph ends before or after or right on a curly brace.

Braces *are* paragraphs of code; with proper indentation it's more than enough to feel the structure. If I really need to stop in the middle of a function, it's to explain something, and then a single line of comment instead of a meaningless empty line (which leaves the reader clueless as to why) is good enough. That said, I'm not opposed to spaces at global scope.

I agree with bearophile; I find that code which leaves a blank line between groups of closely-related lines is much more readable. I don't understand what's with the craving for maximum vertical terseness either, but that may be because the resolution of my primary monitor is currently 1200x1920 :) -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Aug 10 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 10.08.2011, 19:24 Uhr, schrieb Adam D. Ruppe  
<destructionator gmail.com>:

 bearophile:

 The thing is just because you call it a problem a lot doesn't mean
 everyone else sees it that way.

 A lot of us have many years of experience and just don't see it the
 same way you do.

I think a blank line makes code easier on the eyes. When you scroll over it you recognize easily where you are from the size and shape of the paragraphs. So I totally understand that. On the other hand my laptop screen is 1280x800 and I also feel that sometimes I think I scroll over the end of a function body when there is just a blank line in a block of code. So usually I go with the approach of inserting a comment line instead of a blank line, which is usually italic and in a brighter color.

If I was working on a Phobos module I would try to mimic the existing code style (and probably find out that there is no common style :p ). Anyway such things can be put to a vote, just like the idea to not use single capital letters only for template type placeholders (i.e. T, S). Google's code style wiki is nice. It lists all the rules and also offers an explanation. We can have that for Phobos, too, so topics like these don't come up over and over again. The D style guide is a good start: http://www.digitalmars.com/d/2.0/dstyle.html
Aug 10 2011
prev sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 11.08.2011, 19:56 Uhr, schrieb Adam D. Ruppe  
<destructionator gmail.com>:

 If it's worth anything, I use the out contracts in dom.d more as
 checked documentation than for serious bug-finding.

 For example:

 Element appendChild(Element newChild)
 out (ret) { assert(ret is newChild); }
 body { ... }

 I also use it from time to time to assert that a return value is not
 null. The check itself isn't particularly useful, but I think it's
 a nice bit of documentation.

 Actually, IMO, in and out contracts should be in the generated
 ddoc too.

I've been wondering for a while if selective unit tests could be included in DDOC somehow. Most of the 'examples' in the Phobos documentation look like they were taken right out of a unittest block below the function. Like BinaryHeap in std.container:

----------------------------------------------------------------------
DDOC:

// Example from "Introduction to Algorithms" Cormen et al, p 146
int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = heapify(a);
// largest element
assert(h.front == 16);
// a has the heap property
assert(equal(a, [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ]));
----------------------------------------------------------------------
std/container.d:

unittest
{
    {
        // example from "Introduction to Algorithms" Cormen et al., p 146
        int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
        auto h = heapify(a);
        assert(h.front == 16);
        assert(a == [ 16, 14, 10, 8, 7, 9, 3, 2, 4, 1 ]);
        auto witness = [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ];
        for (; !h.empty; h.removeFront(), witness.popFront())
        {
            assert(!witness.empty);
            assert(witness.front == h.front);
        }
        assert(witness.empty);
    }
    {
        int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
        int[] b = new int[a.length];
        BinaryHeap!(int[]) h = BinaryHeap!(int[])(b, 0);
        foreach (e; a)
        {
            h.insert(e);
        }
        assert(b == [ 16, 14, 10, 8, 7, 3, 9, 1, 4, 2 ], text(b));
    }
}
----------------------------------------------------------------------

bearophile, you are the expert with the DRY buzz word ;)
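
For readers coming to this thread later: DMD releases after this discussion added "documented unittests", where a unittest block preceded by a doc comment is emitted as an example in the generated ddoc of the declaration just before it. A minimal sketch follows, assuming a compiler with that feature; `largest` is a hypothetical function used only for illustration.

```d
/**
 * Returns the largest element of a non-empty array.
 * Hypothetical function, used only to illustrate documented unittests.
 */
int largest(const(int)[] values)
{
    int best = values[0];
    foreach (v; values[1 .. $])
        if (v > best)
            best = v;
    return best;
}

/// This block is both a test and the example shown in the docs for largest.
unittest
{
    int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
    assert(largest(a) == 16);
}

void main()
{
    assert(largest([1, 5, 2]) == 5);
}
```

The example text and the test cannot drift apart this way, because they are the same code; the unittest itself only runs when compiling with `-unittest`.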
Aug 11 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There are a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json use braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them. - Jonathan M Davis
Aug 10 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There are a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json use braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them. - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)

You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line. - Jonathan M Davis
Aug 11 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, August 11, 2011 11:34:30 simendsjo wrote:
 On 11.08.2011 11:04, Jonathan M Davis wrote:
 On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There are a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json use braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I said, the style guide needs updating on a few points. Braces on their own line is one of them. - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's great as I do most D coding on a small laptop. Guess I'll have to change it again :)

You're free to do your braces however you'd like in your own code, but any code submitted to Phobos or druntime needs to have the braces on their own line. - Jonathan M Davis

I actually like that a language has a "default" style. Java, C# and Python all have a default style that makes code easy to read regardless of who wrote it (of course, Python has some enforced stuff with indentation). You can, for instance, break the style as much as you'd like in C#, but I've yet to see a library that uses a very different style. But then again, unless it's written in an obfuscated style, it doesn't really matter that much.

Well, you're free to follow Phobos' style too. It's entirely up to you. But bracing style is the sort of thing that's likely to vary quite a bit from programmer to programmer (especially among those with a C or C++ background). - Jonathan M Davis
Aug 11 2011
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 10.08.2011 14:42, Dmitry Olshansky wrote:
     In case I failed to mention it before, I'm working on the project 
 codenamed FReD that is aimed at ~100%* source level compatible 
 overhaul of std.regex, that uses better implementation techniques, 
 provides modern Unicode support and common syntax riches.

     I think it's time for a public beta release,  since it _should_ be 
 ready for mainstream usage. There are some rough edges, and a couple 
 issues that I'm aware of but they are nowhere in realistic use cases.

     In order to avoid unexpected regressions I'd be glad if current 
 std.regex users do try it for their projects/tests.
 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.
 I'll upload newer packages as bugs get exposed and fixed. 
 Alternatively, if you are comfortable with git you may just git clone 
 entire repo. Some helpful notes (same as README) can be found here : 
 https://github.com/blackwhale/FReD/wiki/Beta-release

 Caveats:
     In order for it to compile a tiny change to 2.054 source is needed 
 (no need to recompile Phobos! it's only in templates):
 patch std.algorithm.cmp according to this diff 
 https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631
<https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633> 

 and to get CTFE features working add if(!__ctfe) listed in the next 
 diff on the same webpage.
 (this is already upstream, so if you're using a fork of phobos just 
 pull this in)

 * some API problems might lead to a breaking change, though it didn't 
 happen in this release

Meanwhile the new beta is up:
https://github.com/downloads/blackwhale/FReD/FReD_beta1.7z
or check out the "stable" branch https://github.com/blackwhale/FReD/tree/stable
(as dawgfoto noticed, the master branch tends to break on 64-bit as I develop primarily on 32-bit).

With prominent changes being:
- fixed a horrible memory corruption with regex having certain groups/backrefs in lookaround
- no GC heap activity during matching in all engines, except as workaround for bug http://d.puremagic.com/issues/show_bug.cgi?id=6199
- new prefix searcher, featuring up to 40x search speed up on patterns with semi-fixed prefixes, e.g.
  \b(https?|ftp|file)://\S+
  and
  ([0-9][0-9]?)/([0-9][0-9]?)/([0-9][0-9]([0-9][0-9])?)
- bool opCast for RegexMatch for nice "test if not empty" syntax as suggested by Steven
- lots of small fixes and optimizations

-- 
Dmitry Olshansky
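
To illustrate the "test if not empty" syntax that a bool opCast enables, here is a small sketch. `Match` and `findLiteral` are hypothetical stand-ins invented for this example; they are not the FReD or std.regex types, and only the opCast mechanism itself is the point.

```d
import std.string : indexOf;

// Hypothetical stand-in for a match result; not the FReD/std.regex type.
struct Match
{
    string hit;

    @property bool empty() const { return hit.length == 0; }

    // Used implicitly in boolean contexts such as if (m) and !m.
    bool opCast(T : bool)() const { return !empty; }
}

// Hypothetical helper: finds a literal substring and wraps it in a Match.
Match findLiteral(string haystack, string needle)
{
    auto pos = haystack.indexOf(needle);
    return pos < 0 ? Match(null) : Match(haystack[pos .. pos + needle.length]);
}

void main()
{
    if (auto m = findLiteral("regex beta", "beta"))
        assert(m.hit == "beta");
    else
        assert(false, "expected a match");

    assert(!findLiteral("regex beta", "gamma"));
}
```

The design choice is that callers no longer need to spell out `!m.empty`; the declaration-in-condition form keeps the match variable scoped to the branch that uses it.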
Aug 16 2011
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.


I have not patched DMD, but it gives me some problem here:

    void parseFlags(S)(S flags)
    {
        foreach(ch; flags)//flags are ASCII anyway
        {
            switch(ch)
            {
                foreach(i, op; __traits(allMembers, RegexOption))
                {
                    case RegexOptionNames[i]:
                        if(re_flags & mixin("RegexOption."~op))
                            throw new RegexException(text("redundant flag specified: ",ch));
                        re_flags |= mixin("RegexOption."~op);
                        break;
                }
                default:
                    if(__ctfe)
                        assert(text("unknown regex flag '",ch,"'"));
                    else
                        new RegexException(text("unknown regex flag '",ch,"'"));
            }

To better see the situation I have written a small test case:

import std.typetuple: TypeTuple;

enum RegexOption : uint { A, B, C } // no need to put a semicolon here

alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

void main() {
    RegexOption ch;
    switch (ch) {
        foreach (i, op; __traits(allMembers, RegexOption))
            case RegexOptionNames[i]:
                break;
        default:
            assert(0);
    }
}

test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

This used to work, I think. The new DMD switch analysis seems to have a bug.

-------------

If you want a benchmark, to compare it with other implementations, there is this one:
http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

Bye,
bearophile
Aug 16 2011
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 17.08.2011 3:47, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.


    void parseFlags(S)(S flags)
    {
        foreach(ch; flags)//flags are ASCII anyway
        {
            switch(ch)
            {
                foreach(i, op; __traits(allMembers, RegexOption))
                {
                    case RegexOptionNames[i]:
                        if(re_flags & mixin("RegexOption."~op))
                            throw new RegexException(text("redundant flag specified: ",ch));
                        re_flags |= mixin("RegexOption."~op);
                        break;
                }
                default:
                    if(__ctfe)
                        assert(text("unknown regex flag '",ch,"'"));
                    else
                        new RegexException(text("unknown regex flag '",ch,"'"));
            }

To better see the situation I have written a small test case:

import std.typetuple: TypeTuple;

enum RegexOption : uint { A, B, C } // no need to put a semicolon here

alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

void main() {
    RegexOption ch;
    switch (ch) {
        foreach (i, op; __traits(allMembers, RegexOption))
            case RegexOptionNames[i]:
                break;
        default:
            assert(0);
    }
}

test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

This used to work, I think. The new DMD switch analysis seems to have a bug.

-------------

compile with -w, that's when it happens IIRC. I almost forgot about it, thanks for uncovering it again, you may as well file it.
 If you want a benchmark, to compare it with other implementations, there is
this one:
 http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

very promising. -- Dmitry Olshansky
Aug 17 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Dmitry Olshansky:

 Yes, that's a bug. But it's not a regression,

I think it's a DMD regression, probably introduced with the recent changes in switch semantics. DMD 2.042 doesn't have this bug.
 I assume you started to compile with -w,

I suggest that Phobos devs use -w too.
 thanks for uncovering it again, you may as well file it.

OK, I'll add it to Bugzilla. Bye, bearophile
Aug 17 2011
parent bearophile <bearophileHUGS lycos.com> writes:
 thanks for uncovering it again, you may as well file it.

OK, I'll add it to Bugzilla.

http://d.puremagic.com/issues/show_bug.cgi?id=6518
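
For the flag-parsing switch discussed in this subthread, one alternative on later compilers is to generate the case labels with `static foreach` and leave the switch through a labeled break, which avoids the fallthrough diagnostics. This is a different technique from the tuple foreach in the original code and assumes a compiler that supports `static foreach`. The enum member names and the convention that each flag character is the first letter of its member name are hypothetical, chosen only to keep the sketch short.

```d
// Hypothetical options: the flag character is the first letter of each name.
enum RegexOption : uint { global = 1, ignoreCase = 2, multiline = 4 }

uint parseFlags(string flags)
{
    uint result;
    foreach (char ch; flags)
    {
        Lswitch: switch (ch)
        {
            // Expanded at compile time into one case per enum member.
            static foreach (name; __traits(allMembers, RegexOption))
            {
                case name[0]:
                    if (result & __traits(getMember, RegexOption, name))
                        throw new Exception("redundant flag specified: " ~ ch);
                    result |= __traits(getMember, RegexOption, name);
                    break Lswitch; // labeled: static foreach is not a loop
            }
            default:
                throw new Exception("unknown regex flag '" ~ ch ~ "'");
        }
    }
    return result;
}

void main()
{
    assert(parseFlags("gi") == 3);
    assert(parseFlags("m") == 4);
}
```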
Aug 17 2011