digitalmars.D - [GSOC] regular expressions beta is here

Dmitry Olshansky (29/29) Aug 10 2011 In case I failed to mention it before, I m working on the project

Vladimir Panteleev (7/8) Aug 10 2011 Hi, does this rewrite cover compile-time regex compilation?

Dmitry Olshansky (11/16) Aug 10 2011 Yes, I've dubbed it static regex. In fact it will be something similar

Vladimir Panteleev (8/12) Aug 10 2011 Awesome stuff. D's codegen abilities have the potential to put regex

Jacob Carlborg (7/34) Aug 10 2011 I have a suggestion, make RegexMatch implicitly convertible to bool,

Dmitry Olshansky (14/51) Aug 10 2011 Interesting idea, one problem with it is that I want this:

Steven Schveighoffer (12/57) Aug 10 2011 Without actually looking at the code, why wouldn't something like this

Dmitry Olshansky (4/65) Aug 10 2011 Thanks, I'll give it a try.

Jacob Carlborg (12/26) Aug 10 2011 No, that won't be any problem:

Dmitry Olshansky (8/35) Aug 10 2011 That may be all well, but try writeln on it, what will it print?

Jacob Carlborg (6/33) Aug 10 2011 Oh, I didn't know that it would work implicitly in conditionals. Then

Steven Schveighoffer (7/14) Aug 10 2011 alias this has lots of problems, but it doesn't mean it's *design* is
Andrei Alexandrescu (5/36) Aug 10 2011 That's pretty cool actually because it naturally extends the built-in

Jacob Carlborg (5/9) Aug 10 2011 Cool, I always thought that opCast was for explicit casts, but maybe

Andrei Alexandrescu (4/29) Aug 10 2011 If alias this is any more blunt than regular subtyping (inheritance),

bearophile (7/9) Aug 10 2011 When you write some English text you don't write a single block of text,...

Dmitry Olshansky (16/25) Aug 10 2011 While I haven't asked for review, I do appreciate comments. I have to

bearophile (8/20) Aug 10 2011 Think about reading a book without the half lines between paragraphs. In...

Adam D. Ruppe (5/5) Aug 10 2011 bearophile:

bearophile (4/6) Aug 10 2011 This "you" is a group that includes people like Guido V. Rossum, Rob Pik...
Marco Leise (17/22) Aug 10 2011 I think a blank line makes code easier on the eyes. When you scroll over...

Jonathan M Davis (20/46) Aug 10 2011 This sort of thing has been discussed by the Phobos dev team previously,...

simendsjo (2/4) Aug 10 2011 There is? Parallelism and json uses braces on the same line.

Jonathan M Davis (4/9) Aug 10 2011 It was agreed upon, and where it has been noticed, it has been fixed. Bu...

simendsjo (4/13) Aug 11 2011 Damn - I've been changing my D style to braces on the same line. It's

Jonathan M Davis (5/21) Aug 11 2011 You're free to do your braces however you'd like in your own code, but a...

simendsjo (8/29) Aug 11 2011 I actually like that a language has a "default" style. Java, C# and

Jonathan M Davis (5/41) Aug 11 2011 Well, you're free to follow Phobos' style too. It's entirely up to you. ...

Marco Leise (4/5) Aug 10 2011 You see, and that is why we should make that explicit rather than implic...

Dmitry Olshansky (16/34) Aug 10 2011 Braces *are* paragraphs of code, with proper indention it's more then

Vladimir Panteleev (10/22) Aug 10 2011 I agree with bearophile; I find code that leaves a blank line between

Dmitry Olshansky (5/26) Aug 10 2011 Lucky you, hm... probably turning my monitor on 90 degrees can get me in...

bearophile (25/29) Aug 10 2011 They sometimes are, but inside functions there are other kinds of "parag...

Don (10/18) Aug 10 2011 You're conflating a couple of things here. Invariants are tremendously

Jonathan M Davis (8/11) Aug 10 2011 That would be great, but several bugs need to be fixed before that's pos...
Lutger Blijdestijn (3/26) Aug 10 2011 What about out contracts on interfaces in a library (where you use the

Don (3/29) Aug 10 2011 That involves inheritance. But I don't think there are any cases in

bearophile (27/28) Aug 11 2011 I see three different situations where postconditions are useful in D:

Don (16/55) Aug 11 2011 Sorry, but personally I don't believe that this is useful outside of toy...

Adam D. Ruppe (11/11) Aug 11 2011 If it's worth anything, I use the out contracts in dom.d more as

Marco Leise (46/57) Aug 11 2011 I've been wondering for a while if selective unit tests could be include...

bearophile (38/50) Aug 11 2011 Putting a simpler algorithm in the post-condition implements a third pos...

Don (43/113) Aug 12 2011 Conditions required for this to be true:

Timon Gehr (71/216) Aug 12 2011 If the difference is not an asymptotic one, it can well be time critical...

bearophile (43/51) Aug 12 2011 This code of mine is a real-world example. This is a struct method with ...

Dmitry Olshansky (5/25) Aug 11 2011 I stand corrected about invariants, somehow I wasn't considering them a

Jacob Carlborg (4/9) Aug 10 2011 I always add a blank line before and after statements.

Dmitry Olshansky (19/45) Aug 16 2011 Meanwhile the new beta is up:

bearophile (44/46) Aug 16 2011 I have not patched DMD, but it gives me some problem here:

Dmitry Olshansky (8/52) Aug 17 2011 Yes, that's a bug. But it's not a regression, I assume you started to

bearophile (6/9) Aug 17 2011 I suggest Phobos devs to use -w too.

bearophile (1/4) Aug 17 2011 http://d.puremagic.com/issues/show_bug.cgi?id=6518


Dmitry Olshansky <dmitry.olsh gmail.com> writes:

     In case I failed to mention it before, I'm working on a project 
codenamed FReD, an overhaul of std.regex that aims to be ~100%* 
source-level compatible, uses better implementation techniques, and 
provides modern Unicode support and the common syntax niceties.

     I think it's time for a public beta release, since it _should_ be 
ready for mainstream usage. There are some rough edges and a couple of 
issues that I'm aware of, but they are nowhere near realistic use cases.

     In order to avoid unexpected regressions, I'd be glad if current 
std.regex users tried it on their projects/tests.
To get a small no-crap-included beta package, see the download section of 
https://github.com/blackwhale/FReD for .7zs.
I'll upload newer packages as bugs get exposed and fixed. Alternatively, 
if you are comfortable with git you may just git clone the entire repo. Some 
helpful notes (same as the README) can be found here: 
https://github.com/blackwhale/FReD/wiki/Beta-release

Caveats:
     In order for it to compile, a tiny change to the 2.054 source is needed 
(no need to recompile Phobos! It's only in templates):
patch std.algorithm.cmp according to this diff 
https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631 
<https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633>
and to get the CTFE features working, add the if(!__ctfe) listed in the next 
diff on the same webpage.
(This is already upstream, so if you're using a fork of Phobos, just pull 
this in.)

* some API problems might lead to a breaking change, though it didn't 
happen in this release

-- 
Dmitry Olshansky

Aug 10 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Wed, 10 Aug 2011 13:42:25 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 and to get CTFE features working add if(!__ctfe) listed in the next diff

Hi, does this rewrite cover compile-time regex compilation?

E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.
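
To make the question concrete, here is a minimal sketch of the kind of code generation being asked about. It is not FReD's implementation: genPrefixCheck and staticMatch are hypothetical names, and the sketch only handles a pattern made of ^ followed by ASCII letters and digits. The pattern is turned into a D expression at compile time and mixed into the matcher.

```d
import std.ascii : isAlphaNum;
import std.conv : to;

// Hypothetical helper: builds the source of a boolean expression over `s`
// for a pattern of the form "^literal". Runs at compile time via CTFE.
string genPrefixCheck(string pattern)
{
    assert(pattern.length >= 2 && pattern[0] == '^',
           "only ^literal patterns are supported in this sketch");
    immutable lit = pattern[1 .. $];

    string code = "s.length >= " ~ to!string(lit.length);
    foreach (i, char c; lit)
    {
        // Restricting to letters and digits avoids escaping issues.
        assert(isAlphaNum(c), "only ASCII letters and digits are supported");
        code ~= " && s[" ~ to!string(i) ~ "] == '" ~ c ~ "'";
    }
    return code;
}

// Hypothetical matcher: the pattern is a template argument, so the check
// is generated once per pattern and compiled like hand-written code.
bool staticMatch(string pattern)(string s)
{
    return mixin(genPrefixCheck(pattern));
}

// The generated expression for "^a" is the one from the question.
static assert(genPrefixCheck("^a") == "s.length >= 1 && s[0] == 'a'");

unittest
{
    assert(staticMatch!"^a"("abc"));
    assert(!staticMatch!"^a"(""));
    assert(!staticMatch!"^ab"("ac"));
}
```

An unsupported pattern fails the assert during CTFE, so it is rejected at compile time instead of at run time.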

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 15:16, Vladimir Panteleev wrote:
 On Wed, 10 Aug 2011 13:42:25 +0300, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 and to get CTFE features working add if(!__ctfe) listed in the next diff

 Hi, does this rewrite cover compile-time regex compilation?

 E.g. regex!`^a` compiling to s.length&&s[0]=='a' or something like that.

Yes, I've dubbed it static regex. In fact it will be something similar 
to this, though it will do a heap allocation for backtracking points on 
the first call to match. Heap allocations are definitely going away in the 
final release.
You can pass -version=fred_ct -debug to dmd to see the generated programs.
At the moment it's more a proof of concept than a speed demon, something 
that might change once the CTFE bugs are worked out. Anyway, when it 
doesn't crash the compiler, it's pretty fast :)

-- 
Dmitry Olshansky

Aug 10 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Wed, 10 Aug 2011 14:44:44 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 Yes, I've dubbed it  static regex. In fact it will be something similar  
 to this, though it will do a heap allocation for backtracking points, on  
 first call to match. Heap allocations are definetly going away in final  
 release.

Awesome stuff. D's codegen abilities have the potential to put regex  
matching way ahead of any C/C++ libraries that don't JIT or stuff like  
that.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Aug 10 2011

Jacob Carlborg <doob me.com> writes:

I have a suggestion, make RegexMatch implicitly convertible to bool, 
indicating if there was a match or not.

Aren't there a lot of things that should be declared as private in the 
fred.d module?

-- 
/Jacob Carlborg

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 15:34, Jacob Carlborg wrote:
 I have a suggestion, make RegexMatch implicitly convertible to bool, 
 indicating if there was a match or not.

Interesting idea, one problem with it is that I want this:

auto m = match("bleh", "bleh");
writeln(m);

to actually print "bleh", not "true".
Right now, due to a bug carried over from std.regex (an interface thing), 
writeln(m) will just overflow the stack; m.hit works, however.

 Aren't there a lot of things that should be declared as private in the 
 fred.d module?

Yes, it's a side effect of my having a lot of debugging tools that 
need these internals. If only the package protection attribute or something 
like it was working....
Not to mention that the whole module should work in SafeD with a couple 
of @trusted here and there.

-- 
Dmitry Olshansky

Aug 10 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 10 Aug 2011 07:51:32 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 10.08.2011 15:34, Jacob Carlborg wrote:
 I have a suggestion, make RegexMatch implicitly convertible to bool,  
 indicating if there was a match or not.

 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true

Without actually looking at the code, why wouldn't something like this  
work?

struct RegexMatch
{
    ...
    string toString() {...}
    bool opCast(T : bool)() {...}
}

This isn't an implicit cast, but it will work for conditional statements.
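
A fuller version of the outline above might look like the following. This is a sketch under assumptions, not the real RegexMatch: Match is a hypothetical stand-in with a single hit field, used only to show that printing and truth-testing can coexist.

```d
import std.stdio : writeln;

// Hypothetical stand-in for RegexMatch; only the two conversions matter.
struct Match
{
    string hit; // matched text, empty when there was no match

    // Picked up by writeln and std.format when the value is printed.
    string toString() const
    {
        return hit;
    }

    // Called for cast(bool) and wherever a condition is expected
    // (if, while, !, &&, ||).
    bool opCast(T : bool)() const
    {
        return hit.length != 0;
    }
}

// opCast alone does not make the type implicitly convertible to bool.
static assert(!is(Match : bool));

unittest
{
    auto m = Match("bleh");
    if (m)              // lowered to m.opCast!bool()
        writeln(m);     // goes through toString, not through bool
    assert(m.toString() == "bleh");
    assert(!Match("")); // an empty hit tests as false
}
```

Because the conversion only kicks in for conditions and explicit casts, passing the value to generic code such as writeln still sees a Match, not a bool.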

-Steve

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 16:54, Steven Schveighoffer wrote:
 On Wed, 10 Aug 2011 07:51:32 -0400, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 On 10.08.2011 15:34, Jacob Carlborg wrote:
 I have a suggestion, make RegexMatch implicitly convertible to bool, 
 indicating if there was a match or not.

 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true

 Without actually looking at the code, why wouldn't something like this 
 work?

 struct RegexMatch
 {
    ...
    string toString() {...}
    opCast(T : bool)() {...}
 }

 This isn't an implicit cast, but it will work for conditional statements.

Thanks, I'll give it a try.


-- 
Dmitry Olshansky

Aug 10 2011

Jacob Carlborg <doob me.com> writes:

 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

No, that won't be any problem:

struct Foo
{
     bool b;
     alias b this;
}

auto f = Foo();
static assert(is(typeof(f) == Foo));

The above assert passes as expected.
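
For comparison with opCast, a small sketch of what alias this adds to the same Foo: the variable keeps its own type, but it also converts implicitly to bool in any context, not only in conditions.

```d
struct Foo
{
    bool b;
    alias b this; // Foo can be used wherever a bool is expected
}

static assert(is(Foo : bool));   // an implicit conversion to bool exists
static assert(!is(Foo == bool)); // yet Foo remains a distinct type

unittest
{
    auto f = Foo(true);
    static assert(is(typeof(f) == Foo));

    bool copy = f; // rewritten by the compiler to f.b
    assert(copy);
}
```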

 Aren't there a lot of things that should be declared as private in the
 fred.d module?

 Yes, it's a side effect of me having a lot of debugging tool that do
 need these internals. If only package protection attribute of something
 was working....
 Not to mention that the whole module should work in SafeD with a couple
 of  trusted here and there.

Ok, I see.

-- 
/Jacob Carlborg

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

 No, that won't be any problem:

 struct Foo
 {
     bool b;
     alias b this;
 }

 auto f = Foo();
 static assert(is(typeof(f) == Foo));

 The above assert passes as expected.

That may be all well, but try writeln on it; what will it print?
After some experience with alias this I had to conclude that it's a rather 
blunt tool, and I'd rather stay away from it.
Actually I like Steven's opCast suggestion, so that it works in 
conditionals.

 Aren't there a lot of things that should be declared as private in the
 fred.d module?

 Yes, it's a side effect of me having a lot of debugging tool that do
 need these internals. If only package protection attribute of something
 was working....
 Not to mention that the whole module should work in SafeD with a couple
 of  trusted here and there.

 Ok, I see.


-- 
Dmitry Olshansky

Aug 10 2011

Jacob Carlborg <doob me.com> writes:

On 2011-08-10 17:55, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

 No, that won't be any problem:

 struct Foo
 {
 bool b;
 alias b this;
 }

 auto f = Foo();
 static assert(is(typeof(f) == Foo));

 The above assert passes as expected.

 That may be all well, but try writeln on it, what will it print?

Hmm, it doesn't print anything, I think it looks like a bug in writeln.

 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.

Oh, I didn't know that it would work implicitly in conditionals. Then 
I'm happy with opCast :)

-- 
/Jacob Carlborg

Aug 10 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 10 Aug 2011 12:46:25 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-08-10 17:55, Dmitry Olshansky wrote:

 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.


alias this has lots of problems, but that doesn't mean its *design* is  
blunt, just that the implementation of it is not too good.

 Oh, I didn't know that it would work implicitly in conditionals. Then  
 I'm happy with opCast :)

http://www.d-programming-language.org/operatoroverloading.html#Cast

Note that it only works for structs (not sure if that return type is a  
struct or not...)

-Steve

Aug 10 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/10/11 10:46 AM, Jacob Carlborg wrote:
 On 2011-08-10 17:55, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

 No, that won't be any problem:

 struct Foo
 {
 bool b;
 alias b this;
 }

 auto f = Foo();
 static assert(is(typeof(f) == Foo));

 The above assert passes as expected.

 That may be all well, but try writeln on it, what will it print?

 Hmm, it doesn't print anything, I think it looks like a bug in writeln.

 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.
 Actually I like Steven's opCast suggestion, so that it works in
 conditionals.

 Oh, I didn't know that it would work implicitly in conditionals. Then
 I'm happy with opCast :)

That's pretty cool actually because it naturally extends the built-in 
approach. When you do e.g. if (pointer) that's really equivalent to if 
(cast(bool) pointer) and so on.

Andrei

Aug 10 2011

Jacob Carlborg <doob me.com> writes:

On 2011-08-10 19:45, Andrei Alexandrescu wrote:
 That's pretty cool actually because it naturally extends the built-in
 approach. When you do e.g. if (pointer) that's really equivalent to if
 (cast(bool) pointer) and so on.

 Andrei

Cool, I always thought that opCast was for explicit casts, but maybe 
it's explicit in this case, in some way.

-- 
/Jacob Carlborg

Aug 10 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/10/11 9:55 AM, Dmitry Olshansky wrote:
 On 10.08.2011 18:54, Jacob Carlborg wrote:
 Interesting idea, one problem with it is that I want this:

 auto m = match("bleh", "bleh");
 writeln(m);

 to actually print "bleh", not true
 Right now due to a carry over bug from std.regex (interface thing)
 writln(m) will just do a stackoverflow, m.hit however works.

 No, that won't be any problem:

 struct Foo
 {
 bool b;
 alias b this;
 }

 auto f = Foo();
 static assert(is(typeof(f) == Foo));

 The above assert passes as expected.

 That may be all well, but try writeln on it, what will it print?
 After some experience with alias this I had to conclude that it's rather
 blunt tool, and I'd rather stay away of it.

If alias this is any more blunt than regular subtyping (inheritance), 
that would be a bug. Feel free to submit if you find such issues.

Andrei

Aug 10 2011

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.

When you write some English text you don't write a single block of text, you
organize it into paragraphs, and paragraphs into sections, sections into
chapters, chapters into books, etc. Some time ago I came to understand that
paragraphs are very good in source code too.

So I suggest you add a blank line here and there inside your functions to
separate them into paragraphs. I can't give you a style rule, you will need to
create your own style, but often a function that's more than 10 lines long
needs one or more blank lines inside (some people say that every time you see
one of such paragraphs in a function, especially if it has a comment before it,
then you need to perform an "extract method" to improve the code. I believe
this is bad advice).

I see no contracts in the code (I mean the ones with assert inside, instead of
enforce). I suggest Walter fix this situation. One idea is to include two
versions of the Phobos lib in the zip of the dmd distribution, one with asserts
compiled in and one without, and let DMD import from the correct library
according to the compilation flags.

Some solution to this problem is getting urgent, because Phobos is growing
without the use of one of the nicest features of D (contract programming).
Solving this problem is more urgent than having an excellent regex library in
Phobos. If people don't use contract programming much, it is because you can't
use it in Phobos.

Bye,
bearophile

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 20:02, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.

 When you write some English text you don't write a single block of text, you
organize it into paragraphs, and paragraphs into chapters, chapters into
sections, sections into books, etc. Time ago I have understood that paragraphs
are very good in source code too.

 So I suggest you to add a blank line here and there inside your functions to
separate them into paragraphs. I can't give you a style rule, you will need to
create your own style, but often a function that's more than 10 lines line long
needs one or more blank lines inside (some people say that every time you see
one of such paragraphs in a function, especially if it has a comment before it,
then you need to perform an "extract method" to improve the code. I believe
this is a bad advice).

While I haven't asked for a review, I do appreciate comments. I have to 
say I did no cleanup or other polishing of the code; I'm still working 
on the semantic side of the problems :)
Honestly, I can't see why you are so nervous about code style anyway; you 
seem to bring this up way too often.
About spacing: personally I dislike eating extra vertical space for 
"clarity"; curly braces on their own lines are already way too much.

 I see no contracts in the code (I mean the ones with assert inside, instead of
enforce). I suggest Walter to fix this situation. One idea is to include two
versions of Phobos lib in the zip of the dmd distribution, one with asserts
compiled in and one without, and let DMD import from the correct library
according to the compilation flags.
 Some solution to this problem is getting urgent, because Phobos is growing
without the use of one of the nicest features of D (contract programming).
Solving this problem is more urgent than having an excellent regex library in
Phobos. If people don't use contract programming much, is because you can't use
it in Phobos.

I have to respectfully disagree on this; don't try to pin everything on 
contracts. They are nice but have little value over a plain assert 
_unless_ we are talking about classes and _inheritance_, which isn't the 
case here. And there are lots of asserts here, but much more of the input is 
enforced, since it's totally expected that someone will supply a wrong 
pattern (or have an outside user type in the pattern).
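
The split between the two kinds of checks can be sketched as follows. This is an illustration only: countGroups is a hypothetical helper, not code from FReD, and it ignores escaping for brevity. Input that a user may get wrong goes through enforce, which throws and stays active in -release builds. Assumptions about the code's own logic go through assert, which is compiled out with -release.

```d
import std.exception : assertThrown, enforce;

// Hypothetical helper: counts capture groups in a pattern.
size_t countGroups(string pattern)
{
    size_t groups = 0;
    size_t depth = 0;
    foreach (char c; pattern)
    {
        if (c == '(')
        {
            ++groups;
            ++depth;
        }
        else if (c == ')')
        {
            // A malformed pattern is expected at run time: throw.
            enforce(depth > 0, "unmatched ')' in pattern");
            --depth;
        }
    }
    enforce(depth == 0, "unmatched '(' in pattern");

    // Sanity check on this function's own logic, not on the input.
    assert(groups <= pattern.length);
    return groups;
}

unittest
{
    assert(countGroups("(a)(b)") == 2);
    assertThrown(countGroups("(a"));
    assertThrown(countGroups("a)"));
}
```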

 Bye,
 bearophile


-- 
Dmitry Olshansky

Aug 10 2011

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:

 Honestly I can't get why you are so nervous about code style anyway, you 
 seem to bring this up way to often.

I bring it up often because many D programmers seem half blind to this
problem. I am not willing to go to the extremes the Go language goes to in
order to solve it, but I'd like more recognition of this problem among D
programmers. A somewhat more common style is quite helpful in creating an
ecology of D programmers who share single modules. I guess D programmers are
used to the C/C++ languages, where there are no modules and where programs are
usually made of many files. So they don't see why sharing single modules in a
pool is so useful.


 About spaces personally I dislike eating extra vertical space for 
 "clarity", curly braces on it's own line is already way too much.

Think about reading a book without the half lines between paragraphs. In code
it's the same. Some empty lines are good for the readability of the code.
Curly braces are not always present; sometimes a paragraph ends before or
after or right on a curly brace.


 Have to respectfully disagree on this, don't try to nail everything on 
 contracts.

Contracts don't replace unittests, they complement each other.


 They are nice but have little value over plain assert 
 _unless_ we are talking about classes and _inheritance_, which isn't the 
 case here.

It's easy to forget to test the output of a function, the "out" contracts help
here. In structs the invariant helps you avoid forgetting to call manually a
sanity test function every time you come in and out of a method.
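
A minimal sketch of both features, with hypothetical names (clampPercent, Interval) chosen only for illustration. The out contract is checked on every return path of the function, and the struct invariant runs around each public method, so neither depends on a caller remembering to check. Both are removed in -release builds.

```d
// Hypothetical function: the postcondition documents and checks the result.
int clampPercent(int value)
out (result)
{
    assert(result >= 0 && result <= 100);
}
do // spelled `body` in compilers of the 2.054 era
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;
    return value;
}

// Hypothetical struct: the invariant replaces manual sanity-check calls.
struct Interval
{
    private int lo, hi;

    // Runs after the constructor and before and after every public
    // member function.
    invariant()
    {
        assert(lo <= hi);
    }

    this(int lo, int hi)
    {
        this.lo = lo;
        this.hi = hi;
    }

    void widen(int by)
    {
        lo -= by;
        hi += by;
    }

    int width() const
    {
        return hi - lo;
    }
}

unittest
{
    assert(clampPercent(150) == 100);
    assert(clampPercent(-5) == 0);

    auto r = Interval(1, 3);
    r.widen(2); // the invariant is checked on entry and on exit
    assert(r.width == 6);
}
```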


 And there are lots of asserts here, but much more of input is 
 enforced since it's totally expected to supply wrong pattern (or have an 
 outside  user to type in the pattern).

The idea is to replace those enforces with asserts, and allow user programs to
import Phobos stuff that still contains asserts (from a secondary Phobos lib).
Enforces are for certain kinds of user code; I don't think they fit in
Phobos.

Bye,
bearophile

Aug 10 2011

Adam D. Ruppe <destructionator gmail.com> writes:

bearophile:

The thing is just because you call it a problem a lot doesn't mean
everyone else sees it that way.

A lot of us have many years of experience and just don't see it the
same way you do.

Aug 10 2011

bearophile <bearophileHUGS lycos.com> writes:

Adam D. Ruppe:

 A lot of us have many years of experience and just don't see it the
 same way you do.

This "you" is a group that includes people like Guido van Rossum, Rob Pike, Ken
Thompson and R. Hettinger (they have feelings even stronger than mine on this
topic).

Bye,
bearophile

Aug 10 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 10.08.2011, 19:24 Uhr, schrieb Adam D. Ruppe  
<destructionator gmail.com>:

 bearophile:

 The thing is just because you call it a problem a lot doesn't mean
 everyone else sees it that way.

 A lot of us have many years of experience and just don't see it the
 same way you do.

I think a blank line makes code easier on the eyes. When you scroll over  
it you recognize easily where you are from the size and shape of the  
paragraphs. So I totally understand that. On the other hand my laptop  
screen is 1280x800 and I also feel that sometimes I think I scroll over  
the end of a function body when there is just a blank line in a block of  
code. So usually I go with the approach of inserting a comment line  
instead of a blank line, which is usually italic and in a brighter color.
If I was working on a Phobos module I would try to mimic existing code  
style (and probably find out that there is no common style :p ). Anyway  
such things can be up to a vote just like the idea to not use single  
capital letters only for template type placeholders (i.e. T, S).
Google's code style wiki is nice. It lists all the rules and also offers  
an explanation. We can have that for Phobos, too. So topics like these  
don't come up over and over again. The D style guide is a good start:  
http://www.digitalmars.com/d/2.0/dstyle.html

Aug 10 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, August 10, 2011 21:42:01 Marco Leise wrote:
 Am 10.08.2011, 19:24 Uhr, schrieb Adam D. Ruppe
 
 <destructionator gmail.com>:
 bearophile:
 
 The thing is just because you call it a problem a lot doesn't mean
 everyone else sees it that way.
 
 A lot of us have many years of experience and just don't see it the
 same way you do.

 
 I think a blank line makes code easier on the eyes. When you scroll over
 it you recognize easily where you are from the size and shape of the
 paragraphs. So I totally understand that. On the other hand my laptop
 screen is 1280x800 and I also feel that sometimes I think I scroll over
 the end of a function body when there is just a blank line in a block of
 code. So usually I go with the approach of inserting a comment line
 instead of a blank line, which is usually italic and in a brighter color.
 If I was working on a Phobos module I would try to mime existing code
 style (and probably find out that there is no common style :p ). Anyway
 such things can be up to a vote just like the idea to not use single
 capital letters only for template type placeholders (i.e. T, S).
 Google's code style wiki is nice. It lists all the rules and also offers
 an explanation. We can have that for Phobos, too. So topics like these
 don't come up over and over again. The D style guide is a good start:
 http://www.digitalmars.com/d/2.0/dstyle.html

This sort of thing has been discussed by the Phobos dev team previously, and 
the general consensus was not to enforce much in the way of formatting in a 
style guide. There are a few things that were agreed upon (such as always putting 
braces on their own line), but on the whole, the style guide is supposed to 
focus on the API (so, things like function and variable names) rather than how 
code is formatted. I have an update to the style guide as a pull request which 
is currently being reviewed to make sure that the style guide on the site is 
in line with what we do:

https://github.com/D-Programming-Language/d-programming-language.org/pull/16

But I'm certain that you're not going to get the Phobos devs to agree on a 
style guide like Bearophile wants. And honestly, I'm a bit tired of the topic 
coming up. The style guide does need some updates, but it's mostly correct. It's 
essentially what we've decided on, and I don't see any reason to keep 
discussing it over and over.

Personally, I'd prefer that Dmitry had more blank lines in his code, but it's 
up to him how he does that as long as his code falls within the rules set down 
by the D style guide. And for any of his code which isn't going into Phobos, 
it's completely up to him how to format it.

- Jonathan M Davis

Aug 10 2011

simendsjo <simendsjo gmail.com> writes:

On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

There is? Parallelism and json uses braces on the same line.

Aug 10 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

 
 There is? Parallelism and json uses braces on the same line.

It was agreed upon, and where it has been noticed, it has been fixed. But as I 
said, the style guide needs updating on a few points. Braces on their own line 
is one of them.

- Jonathan M Davis

Aug 10 2011

simendsjo <simendsjo gmail.com> writes:

On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

 There is? Parallelism and json uses braces on the same line.

 It was agreed upon, and where it has been noticed, it has been fixed. But as I
 said, the style guide needs updating on a few points. Braces on their own line
 is one of them.

 - Jonathan M Davis

Damn - I've been changing my D style to braces on the same line. It's 
great as I do most D coding on a small laptop. Guess I'll have to change 
it again :)

Aug 11 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

 
 There is? Parallelism and json uses braces on the same line.

 
 It was agreed upon, and where it has been noticed, it has been fixed.
 But as I said, the style guide needs updating on a few points. Braces
 on their own line is one of them.
 
 - Jonathan M Davis

 
 Damn - I've been changing my D style to braces on the same line. It's
 great as I do most D coding on a small laptop. Guess I'll have to change
 it again :)

You're free to do your braces however you'd like in your own code, but any 
code submitted to Phobos or druntime needs to have the braces on their own 
line.

- Jonathan M Davis

Aug 11 2011

simendsjo <simendsjo gmail.com> writes:

On 11.08.2011 11:04, Jonathan M Davis wrote:
 On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

 There is? Parallelism and json uses braces on the same line.

 It was agreed upon, and where it has been noticed, it has been fixed.
 But as I said, the style guide needs updating on a few points. Braces
 on their own line is one of them.

 - Jonathan M Davis

 Damn - I've been changing my D style to braces on the same line. It's
 great as I do most D coding on a small laptop. Guess I'll have to change
 it again :)

 You're free to do your braces however you'd like in your own code, but any
 code submitted to Phobos or druntime needs to have the braces on their own
 line.

 - Jonathan M Davis


Python all has a default style that makes code easy to read regardless 
of who wrote it (of course, python has some enforced stuff with 
indentation). You can, for instance, break the style as much as you'd 


But then again.. Unless it's written in an obfuscated style, it doesn't 
really matter that much..

Aug 11 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, August 11, 2011 11:34:30 simendsjo wrote:
 On 11.08.2011 11:04, Jonathan M Davis wrote:
 On Thursday, August 11, 2011 10:50:41 simendsjo wrote:
 On 10.08.2011 23:16, Jonathan M Davis wrote:
 On 10.08.2011 22:12, Jonathan M Davis wrote:
 There a few things that were agreed upon (such as always putting
 braces on their own line),

 
 There is? Parallelism and json uses braces on the same line.

 
 It was agreed upon, and where it has been noticed, it has been
 fixed.
 But as I said, the style guide needs updating on a few points.
 Braces
 on their own line is one of them.
 
 - Jonathan M Davis

 
 Damn - I've been changing my D style to braces on the same line. It's
 great as I do most D coding on a small laptop. Guess I'll have to
 change
 it again :)

 
 You're free to do your braces however you'd like in your own code, but
 any code submitted to Phobos or druntime needs to have the braces on
 their own line.
 
 - Jonathan M Davis

 

 Python all has a default style that makes code easy to read regardless
 of who wrote it (of course, python has some enforced stuff with
 indentation). You can, for instance, break the style as much as you'd

 
 But then again.. Unless it's written in an obfuscated style, it doesn't
 really matter that much..

Well, you're free to follow Phobos' style too. It's entirely up to you. But 
bracing style is the sort of thing that's likely to vary quite a bit from 
programmer to programmer (especially among those with a C or C++ background).

- Jonathan M Davis

Aug 11 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 10.08.2011, 22:12 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 [...] I don't see any reason to keep discussing it over and over.

You see, and that is why we should make that explicit rather than implicit  
in the style guide. An additional point "personal preference" could list  
"blank lines to group logical blocks of code".

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 21:11, bearophile wrote:
 Dmitry Olshansky:

 Honestly I can't get why you are so nervous about code style anyway, you
 seem to bring this up way to often.

 I bring it often because many D programmers seem half blind to this problem. I
am not willing to go to the extremes Go language goes to solve this problem,
but I'd like more recognition of this problem in D programmers. A bit more
common style is quite helpful to create an ecology of D programmers that share
single modules. I guess D programmers are used to C/C++ languages, where there
are not modules and where programs are usually made of many files. So they
don't see why sharing single modules in the pool is so useful.


 About spaces personally I dislike eating extra vertical space for
 "clarity", curly braces on it's own line is already way too much.

 Think about reading a book without the half lines between paragraphs. In code
it's the same. Some empty lines are good to improve readability of the code.
Curly braces are not always present, sometimes a paragraphs ends before or
after or right on a curly brace.

Braces *are* paragraphs of code; with proper indentation it's more than 
enough to feel the structure. If I really need to stop in the middle of a 
function, it's to explain something, and then a single line of comment 
instead of a meaningless empty line (which leaves the reader clueless as to 
why) is good enough. Except that I'm not opposed to spaces at global scope.

 Have to respectfully disagree on this, don't try to nail everything on
 contracts.

 Contracts don't replace unittests, they complement each other.

unittest != assert, though the former do contain asserts.
 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

 It's easy to forget to test the output of a function, the "out" contracts help
here. In structs the invariant helps you avoid forgetting to call manually a
sanity test function every time you come in and out of a method.


 And there are lots of asserts here, but much more of input is
 enforced since it's totally expected to supply wrong pattern (or have an
 outside  user to type in the pattern).

 The idea is to replace those enforces with asserts, and allow user programs to
import Phobos stuff that still contain asserts (from a secondary Phobos lib).
Enforces are for certain kinds of user code, I don't think they are fit in
Phobos.

Not gonna work: file I/O is certainly in Phobos, as are network sockets, 
etc. You can't assert that something external won't fail, while you'd 
normally assert on your local logical invariants. As for other things, I 
thought e.g. ranges are already hooked on asserts, as are many other 
templates. If you have a list of modules where you find the lack of 
compiled-in contracts/asserts unbearable, do tell.
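
To make the enforce/assert split concrete, here is a minimal sketch. The function names parseRepeatCount and midpoint are hypothetical and are not taken from Phobos or from the regex module; only std.exception.enforce and std.conv.to are real library calls. The idea is that enforce guards input coming from outside the program and stays active in release builds, while assert documents a local logical invariant.

```d
import std.conv : to;
import std.exception : assertThrown, enforce;

// Hypothetical helper: the text is typed in by a user, so bad input is
// expected and has to be reported even in release builds.
uint parseRepeatCount(string text)
{
    enforce(text.length > 0, "repeat count must not be empty");
    auto n = text.to!uint; // throws ConvException on non-numeric input
    enforce(n <= 1000, "repeat count too large");
    return n;
}

// Hypothetical helper: callers inside the module guarantee lo <= hi,
// so a violation is a programming bug and assert is the right tool.
size_t midpoint(size_t lo, size_t hi)
{
    assert(lo <= hi, "internal error: lo > hi");
    return lo + (hi - lo) / 2;
}

unittest
{
    assert(parseRepeatCount("42") == 42);
    assertThrown!Exception(parseRepeatCount(""));
    assertThrown!Exception(parseRepeatCount("5000"));
    assert(midpoint(2, 8) == 5);
}
```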

I hate being dragged into these discussions, but just can't resist.

-- 
Dmitry Olshansky

Aug 10 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 About spaces personally I dislike eating extra vertical space for
 "clarity", curly braces on it's own line is already way too much.

 Think about reading a book without the half lines between paragraphs.  
 In code it's the same. Some empty lines are good to improve readability  
 of the code. Curly braces are not always present, sometimes a  
 paragraphs ends before or after or right on a curly brace.

 Braces *are* paragraphs of code, with proper indention it's more then  
 enough to fell the structure. If I really need to stop in the middle  
 function, it's to explain something, then a single line of comment  
 instead of meaningless empty line (which leaves reader clueless as to  
 why) is good enough. Except that I'm not opposed to spaces at global  
 scope.

I agree with bearophile; I find that leaving a blank line between groups of  
closely-related lines makes the code much more readable. I don't understand  
what's with the craving for maximum vertical terseness either, but that  
may be because the resolution of my primary monitor is currently 1200x1920  
:)

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 22:11, Vladimir Panteleev wrote:
 On Wed, 10 Aug 2011 20:59:27 +0300, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 About spaces personally I dislike eating extra vertical space for
 "clarity", curly braces on it's own line is already way too much.

 Think about reading a book without the half lines between 
 paragraphs. In code it's the same. Some empty lines are good to 
 improve readability of the code. Curly braces are not always 
 present, sometimes a paragraphs ends before or after or right on a 
 curly brace.

 Braces *are* paragraphs of code, with proper indention it's more then 
 enough to fell the structure. If I really need to stop in the middle 
 function, it's to explain something, then a single line of comment 
 instead of meaningless empty line (which leaves reader clueless as to 
 why) is good enough. Except that I'm not opposed to spaces at global 
 scope.

 I agree with bearophile; I find code that leaves a blank line between 
 closely-related lines make the code much more readable. I don't 
 understand what's with the craving for maximum vertical terseness 
 either, but that may be because the resolution of my primary monitor 
 is currently 1200x1920 :)

Lucky you, hm... probably turning my monitor by 90 degrees can get me into 
this league of abundant vertical space :)

-- 
Dmitry Olshansky

Aug 10 2011

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:

 Braces *are* paragraphs of code,

They sometimes are, but inside functions there are other kinds of "paragraphs".

As an example, this is first-quality C code (partially written by R. Hettinger):
http://hg.python.org/cpython/file/d5b274a0b0a5/Modules/_collectionsmodule.c

If you take a random function from that page, like:

static int
deque_del_item(dequeobject *deque, Py_ssize_t i)
{
    PyObject *item;

    assert (i >= 0 && i < deque->len);
    if (_deque_rotate(deque, -i) == -1)
        return -1;

    item = deque_popleft(deque, NULL);
    assert (item != NULL);
    Py_DECREF(item);

    return _deque_rotate(deque, i);
}

You see a blank line after "Py_DECREF(item);" even though there is no closing
brace. The purpose of those blank lines is to help the person reading the
code tell apart the various things done by that function. This C code is
well written.


 No gonna work, file I/O is certainly in Phobos, as are network sockets, 
 etc. You can't assert that something external won't fail.

OK.


 I hate being drugged in these discussions, but just can't resist.

I am sorry, but thank you for answering :-)

Bye,
bearophile

Aug 10 2011

Don <nospam nospam.com> writes:

bearophile wrote:
 Contracts don't replace unittests, they complement each other.
 
 
 They are nice but have little value over plain assert 
 _unless_ we are talking about classes and _inheritance_, which isn't the 
 case here.

 
 It's easy to forget to test the output of a function, the "out" contracts help
here. In structs the invariant helps you avoid forgetting to call manually a
sanity test function every time you come in and out of a method.

You're conflating a couple of things here. Invariants are tremendously 
helpful for structs as well as classes.
"out" contracts seem to be almost useless, unless you have a theorem 
prover. The reason is, that they test nothing apart from the function 
they are attached to, and it's much better to do that with unittesting.
They have very little in common with 'in' contracts.

I think that EVERY struct and class in Phobos should have an invariant 
(except for something like Complex, where there are no invalid values).
But I don't think 'out' contracts would add much value at all.
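
The invariant point can be sketched as follows. Span is a hypothetical struct, not something from Phobos; it only illustrates that the invariant is checked automatically around public methods, so no manual sanity call is needed.

```d
// Hypothetical struct: a half-open index range whose only valid
// states satisfy lo <= hi.
struct Span
{
    private size_t lo, hi;

    // Checked after construction and before/after every public method
    // call, as long as contracts are compiled in (no -release).
    invariant()
    {
        assert(lo <= hi, "Span: lo must not exceed hi");
    }

    this(size_t lo, size_t hi)
    {
        this.lo = lo;
        this.hi = hi;
    }

    size_t length() const
    {
        return hi - lo;
    }

    // No explicit sanity check here: dropping more elements than the
    // span holds makes lo > hi, and the invariant reports it on exit.
    void dropFront(size_t n)
    {
        lo += n;
    }
}

unittest
{
    auto s = Span(2, 10);
    s.dropFront(3);
    assert(s.length == 5);
}
```

A type like Complex gains nothing from this, because every combination of field values is valid.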

Aug 10 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, August 11, 2011 06:58:51 Don wrote:
 I think that EVERY struct and class in Phobos should have an invariant
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.

That would be great, but several bugs need to be fixed before that's possible, 
including

http://d.puremagic.com/issues/show_bug.cgi?id=1251
http://d.puremagic.com/issues/show_bug.cgi?id=5039
http://d.puremagic.com/issues/show_bug.cgi?id=5058
http://d.puremagic.com/issues/show_bug.cgi?id=5500

- Jonathan M Davis

Aug 10 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Don wrote:

 bearophile wrote:
 Contracts don't replace unittests, they complement each other.
 
 
 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

 
 It's easy to forget to test the output of a function, the "out" contracts
 help here. In structs the invariant helps you avoid forgetting to call
 manually a sanity test function every time you come in and out of a
 method.

 
 You're conflating a couple of things here. Invariants are tremendously
 helpful for structs as well as classes.
 "out" contracts seem to be almost useless, unless you have a theorem
 prover. The reason is, that they test nothing apart from the function
 they are attached to, and it's much better to do that with unittesting.
 They have very little in common with 'in' contracts.
 
 I think that EVERY struct and class in Phobos should have an invariant
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.

What about out contracts on interfaces in a library (where you use the 
library by implementing them).

Aug 10 2011

Don <nospam nospam.com> writes:

Lutger Blijdestijn wrote:
 Don wrote:
 
 bearophile wrote:
 Contracts don't replace unittests, they complement each other.


 They are nice but have little value over plain assert
 _unless_ we are talking about classes and _inheritance_, which isn't the
 case here.

 It's easy to forget to test the output of a function, the "out" contracts
 help here. In structs the invariant helps you avoid forgetting to call
 manually a sanity test function every time you come in and out of a
 method.

 You're conflating a couple of things here. Invariants are tremendously
 helpful for structs as well as classes.
 "out" contracts seem to be almost useless, unless you have a theorem
 prover. The reason is, that they test nothing apart from the function
 they are attached to, and it's much better to do that with unittesting.
 They have very little in common with 'in' contracts.

 I think that EVERY struct and class in Phobos should have an invariant
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.

 
 What about out contracts on interfaces in a library (where you use the 
 library by implementing them).

That involves inheritance. But I don't think there are any cases in 
Phobos where that is currently applicable.

Aug 10 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

"out" contracts seem to be almost useless, unless you have a theorem prover.
The reason is, that they test nothing apart from the function they are attached
to, and it's much better to do that with unittesting.<

I see three different situations where postconditions are useful in D:

1) Sometimes the result of your function/method must satisfy some simple
condition to be correct.

As an example, it must be a nonnegative number. Then you add assert(result >= 0,
"..."); in the out. For a Phobos example, std.algorithm.countUntil
postcondition is allowed to test assert(result >= -1, "...");

Other possible conditions are the output string can't be longer than a certain
amount (like longer than the input string), and so on.

In certain cases the program that finds the solution is slow, but testing the
correctness of the result is fast. I have hit many situations like this.
As an example, you test whether the result of a complex sorting algorithm is
ordered and has the same length as the input (but maybe you don't test that the
output items are the same as the input items).


2) I have found many situations where I am able to solve a problem with both a
simple, slow brute-force solver and a complex, fast algorithm. The little
program may be too slow for normal usage, but it's just a few lines long
(especially if I use a lot of std.algorithm stuff) and it's much less likely to
contain bugs.
You can't always verify the result of the fast algorithm with the slow
algorithm, since that would defeat the purpose of the fast one.
In such situations I write the postcondition like this:

in {
    // ...
}
out(result) {
    // some fast postcondition tests here

    debug {
        assert(result == slowAlgorithm(input));
    }
}
body {
    // fast algorithm here
}


This way, in release mode it tests nothing, in nonrelease build it tests the
fast postconditions, and in debug mode it also verifies the fast algorithm
gives the same results as the slow algorithm. Generally solving a problem in
two quite different ways helps catch problems in the algorithms.
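
A self-contained sketch of this pattern, using sorting as the problem. The names slowSorted and fastSorted are made up for the illustration, and the code uses `do`, the current spelling of `body`.

```d
import std.algorithm : isSorted, sort;

// Reference implementation: short, obviously correct, O(n^2).
int[] slowSorted(const int[] input)
{
    auto r = input.dup;

    foreach (i; 0 .. r.length)
    {
        foreach (j; i + 1 .. r.length)
        {
            if (r[j] < r[i])
            {
                auto t = r[i];
                r[i] = r[j];
                r[j] = t;
            }
        }
    }

    return r;
}

int[] fastSorted(const int[] input)
out (result)
{
    // Cheap checks, active in every build without -release.
    assert(result.length == input.length);
    assert(result.isSorted);

    // Expensive cross-check, compiled in only with -debug.
    debug
    {
        assert(result == slowSorted(input));
    }
}
do
{
    auto r = input.dup;
    sort(r);
    return r;
}

unittest
{
    assert(fastSorted([3, 1, 2]) == [1, 2, 3]);
}
```

The cheap checks alone would accept a sorted array of the right length with the wrong elements; only the debug cross-check catches that.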


3) When D gets the prestate ("old" in some contract programming
implementations), I will be able to use the prestate inside the postcondition
to better verify that the function/method has changed the globals or instance
attributes in a correct way. You can't put such tests in the class/struct
invariant, or in the precondition.


I'm using postconditions often in my code (less often than preconditions, but
often enough). A theorem prover is not strictly necessary for them to be useful.

Bye,
bearophile

Aug 11 2011

Don <nospam nospam.com> writes:

bearophile wrote:
 Don:
 
 "out" contracts seem to be almost useless, unless you have a theorem prover.
The reason is, that they test nothing apart from the function they are attached
to, and it's much better to do that with unittesting.<

 
 I see three different situations where postconditions are useful in D:
 
 1) Sometimes the result of your function/method must satisfy some simple
condition to be correct.
 
 As example, it must be a nonnegative number. Then you add assert(result >= 0,
"..."); in the out. For a Phobos example, std.algorithm.countUntil
postcondition is allowed to test assert(result >= -1, "...");
 
 Other possible conditions are the output string can't be longer than a certain
amount (like longer than the input string), and so on.
 
 In certain cases the program the finds the solution is slow, but testing the
correctness of a function is fast.   I have hit many situations like this. 
 As an example you test if the result of a complex sorting algorithm is
ordered, and with the same length of the input (but maybe you don't test for
the output items to be the same of the input).
 
 
 2) I have found many situations where I am able to solve a problem with both a
simple and slow brute force solver, and a complex and fast algorithm to solve a
problem. The little program maybe is too much slow for normal usage, but it's
just few lines long (especially if I use lot of std.algorithm stuff) but it's
much less likely to contain bugs.

Sorry, but personally I don't believe that this is useful outside of toy 
examples.
The question is, what bugs does it find that aren't found by a trivial 
unit test?

 You can't always verify the result of the fast algorithm with the slow
algorithm, this is not useful.
 In such situations I write the postcondition like this:
 
 in {
     // ...
 }
 out(result) {
     // some fast postcondition tests here

     debug {
         assert(result == slowAlgorithm(input));
     }
 }
 body {
     // fast algorithm here
 }
 
 
 This way, in release mode it tests nothing, in nonrelease build it tests the
fast postconditions, and in debug mode it also verifies the fast algorithm
gives the same results as the slow algorithm. Generally solving a problem in
two quite different ways helps catch problems in the algorithms.
 
 
 3) When D will get the prestate ("old" in some contract programming
implementations), I will be able to use the prestate inside the postcondition
to verify better than the function/method has changed the globals, or instance
attributes in a correct way. You can't put such tests in the class/struct
invariant, or in the precondition.

There are two cases:
(1) it's a very tight test. In which case, it's essentially a unit test.
or (2) it's a very loose test. In which case, it doesn't find bugs.


 I'm using postconditions often in my code (less often than preconditions, but
often enough). A theorem prover is not strictly necessary for them to be useful.

I would like to see an example of a good postcondition.
The crucial feature is, they do NOTHING except find bugs in the function 
they are attached to. So it's very difficult to invent a plausible one.
For starters, it really needs to be a function with multiple return 
values. Otherwise, you can just stick asserts just before your return 
statement, and you don't need __old or any such thing.
Under what circumstances are they more valuable than any other 
assert inside a function?

Aug 11 2011

Adam D. Ruppe <destructionator gmail.com> writes:

If it's worth anything, I use the out contracts in dom.d more as
checked documentation than for serious bug-finding.

For example:

Element appendChild(Element newChild)
out (ret) { assert(ret is newChild); }
body { ... }

I also use it from time to time to assert that a return value is not
null. The check itself isn't particularly useful, but I think it's
a nice bit of documentation.

Actually, IMO, in and out contracts should be in the generated
ddoc too.
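
For readers without dom.d at hand, a cut-down stand-in shows the same "contract as checked documentation" idea. This Element class is a hypothetical simplification and not the real dom.d type; it uses `do`, the current spelling of `body`.

```d
// Hypothetical, simplified stand-in for a DOM element.
class Element
{
    Element parent;
    Element[] children;

    Element appendChild(Element newChild)
    in
    {
        assert(newChild !is null);
    }
    out (ret)
    {
        // Documents the API promise: the very same object comes back,
        // and it is now attached to this element.
        assert(ret is newChild);
        assert(ret.parent is this);
    }
    do
    {
        newChild.parent = this;
        children ~= newChild;
        return newChild;
    }
}

unittest
{
    auto root = new Element;
    auto kid = new Element;
    assert(root.appendChild(kid) is kid);
    assert(root.children.length == 1);
}
```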

Aug 11 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 11.08.2011, 19:56 Uhr, schrieb Adam D. Ruppe  
<destructionator gmail.com>:

 If it's worth anything, I use the out contracts in dom.d more as
 checked documentation than for serious bug-finding.

 For example:

 Element appendChild(Element newChild)
 out (ret) { assert(ret is newChild); }
 body { ... }

 I also use it from time to time to assert that a return value is not
 null. The check itself isn't particularly useful, but I think it's
 a nice bit of documentation.

 Actually, IMO, in and out contracts should be in the generated
 ddoc too.

I've been wondering for a while if selective unit tests could be included  
in DDOC somehow. Most of the 'examples' in the Phobos documentation look  
like they were taken right out of a unittest block below the function. Like  
BinaryHeap in std.containers:

----------------------------------------------------------------------

DDOC:

// Example from "Introduction to Algorithms" Cormen et al, p 146
int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
auto h = heapify(a);
// largest element
assert(h.front == 16);
// a has the heap property
assert(equal(a, [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ]));

----------------------------------------------------------------------

std/containers.d:

unittest
{
     {
         // example from "Introduction to Algorithms" Cormen et al., p 146
         int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
         auto h = heapify(a);
         assert(h.front == 16);
         assert(a == [ 16, 14, 10, 8, 7, 9, 3, 2, 4, 1 ]);
         auto witness = [ 16, 14, 10, 9, 8, 7, 4, 3, 2, 1 ];
         for (; !h.empty; h.removeFront(), witness.popFront())
         {
             assert(!witness.empty);
             assert(witness.front == h.front);
         }
         assert(witness.empty);
     }
     {
         int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ];
         int[] b = new int[a.length];
         BinaryHeap!(int[]) h = BinaryHeap!(int[])(b, 0);
         foreach (e; a)
         {
             h.insert(e);
         }
         assert(b == [ 16, 14, 10, 8, 7, 3, 9, 1, 4, 2 ], text(b));
     }
}

----------------------------------------------------------------------

bearophile, you are the expert with the DRY buzz word ;)

Aug 11 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

 2) I have found many situations where I am able to solve a problem with both a
simple and slow brute force solver, and a complex and fast algorithm to solve a
problem. The little program maybe is too much slow for normal usage, but it's
just few lines long (especially if I use lot of std.algorithm stuff) but it's
much less likely to contain bugs.


 Sorry, but personally I don't believe that this is useful outside of toy
examples.
 The question is, what bugs does it find that aren't found by a trivial unit
test?

 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

Putting a simpler algorithm in the post-condition implements a third
possibility you are missing.

Usually unit tests verify some specific cases (you are also able to add generic
testing code in the unit test, but this is just like moving the postcondition
elsewhere).

If you put an alternative algorithm in the postcondition (under debug{} if you
want), you have some advantages:
- It's tight, because the second algorithm is supposed to always give the same
results as the function.
- It works with the real inputs the program is run on, not just the cases
you have put in the unit test. Sometimes you forget to add certain cases in the
unittests. Putting the test in the postcondition makes sure it always runs, for
all the inputs your function is run on (unless you disable it), so you will
catch the cases you didn't think of in the unittests.


 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.

In Eiffel you have the prestate too (the old), so the postcondition is the only
place where such information is usable. I hope prestate will be added to D DbC,
because it's a major sub-feature of DbC. But I don't agree that postconditions
are useless in D.


 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

If a function has multiple return values the out(result) helps make sure all
the return paths are verified.

If the function has only one return value it helps anyway, because it helps you
not forget to verify the result.


 Under what circumstances are they are more valuable than any other assert
inside a function?

I have already given some answers. Another answer is this:


int foo(int x)
in {
    // ...
}
out(result) {
    auto y = computeSomething(result);
    assert(y ...);
    assert(y ...);
}
body {
    // ...
}

The out{} helps you organize your code, separating the tests of the body from
the postcondition tests. Also in the postcondition you are allowed to define
new variables and call things. All this out(){} code vanishes in release mode.
How do you do that with just asserts inside the body?


If you do this the asserts will vanish in release mode, but y will still be
computed, wasting computations (a smart compiler may see that y is not used
and remove it, but this optimization is not guaranteed if the computation of
y is complex and it's done in-place):

int foo(int x)
in {
    // ...
}
body {
    result = ...;
    auto y = computeSomething(result);
    assert(y ...);
    assert(y ...);
    return result;
}


I presume there are ways to disable the computation of y in release mode, but I
don't want to think about them. I just stick the y computation in the
postcondition and the compiler will take care of it.

Bye,
bearophile

Aug 11 2011

Don <nospam nospam.com> writes:

bearophile wrote:
 Don:
 
 2) I have found many situations where I am able to solve a problem with both a
simple and slow brute force solver, and a complex and fast algorithm to solve a
problem. The little program maybe is too much slow for normal usage, but it's
just few lines long (especially if I use lot of std.algorithm stuff) but it's
much less likely to contain bugs.


 
 Sorry, but personally I don't believe that this is useful outside of toy
examples.
 The question is, what bugs does it find that aren't found by a trivial unit
test?

 
 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

 
 Putting a simpler algorithm in the post-condition implements a third
possibility you are missing.
 
 Usually unit tests verify some specific cases (you are also able to add
generic testing code in the unit test, but this is just like moving the
postcondition elsewhere).
 
 If you put an alternative algorithm in the postcondition (under debug{} if you
want), you have some advantages:
 - It's tight, because the second algorithm is supposed to always give the same
results as the function.

 - It works with the real inputs the program is run on, not just the cases
you have put in the unit test.

Conditions required for this to be true:
(1) the function must not be time critical;
(2) an alternative algorithm must exist;
(3) the alternative algorithm must be bug-free;
(4) the function must not have been tested properly;
(5) the faulty test cases must occur during debugging (they won't be 
caught during production);
(6) the programmer must remember to put the asserts in the 'out' 
contract, but not put them into the body of the function.

This doesn't leave much.


Sometimes you forget to add certain cases in the unittests. Putting the 
test in the postcondition makes sure it always runs, for all the inputs 
your function is run on (unless you disable it), so you will catch the 
cases you didn't think of in the unittests.

 
 
 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.

 
 In Eiffel you have the prestate too (the old), so the postcondition is the
only place where such information is usable. I hope prestate will be added to D
DbC, because it's a major sub-feature of DbC. But I don't agree that
postconditions are useless in D.

??? Does that relate to my sentence in any way?

 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

 
 If a function has multiple return values the out(result) helps make sure all
the return paths are verified.

That's what I said.

 If the function has only one return value it helps anyway, because it helps
you not forget to verify the result.

???? Why would you remember to put an assert in the postcondition, when 
you didn't put it into the function?

 
 
 Under what circumstances are they more valuable than any other assert
inside a function?

 
 I have already given some answers.

No you haven't.

 Another answer is this:
 
 
 int foo(int x)
 in {
     // ...
 }
 out(result) {
     auto y = computeSomething(result);
     assert(y ...);
     assert(y ...);
 }
 body {
     // ...
 }
 
 The out{} helps you organize your code, separating the tests of the body from
the postcondition tests. Also in the postcondition you are allowed to define
new variables and call things. All this out(){} code vanishes in release mode.
How do you do that with just asserts inside the body?
 
 
 If you do this the asserts will vanish in release mode, but the y will be
computed still, wasting computations (a smart compiler is able to see y is not
used and etc, but it's not sure this optimization happens if the computation of
y is complex and it's done in-place):
 
 int foo(int x)
 in {
     // ...
 }
 body {
     result = ...;
     auto y = computeSomething(result);
     assert(y ...);
     assert(y ...);
     return result;
 }
 
 
 I presume there are ways to disable the computation of y in release mode, but
I don't want to think about them. I just stick the y computation in the
postcondition and the compiler will take care of it.

Trivial!
Make the postcondition a nested function. (You can even make it a 
delegate literal, if it's only used in one place).
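
A minimal sketch of that suggestion, assuming a hypothetical helper `computeSomething` and a made-up body for `foo`: the postcondition tests are bundled into a nested function, and the only call to it sits inside an `assert`.

```d
// Hypothetical helper standing in for an expensive check.
int computeSomething(int r) { return r % 2; }

int foo(int x)
{
    // Nested function that bundles the postcondition tests.
    bool postOk(int result)
    {
        auto y = computeSomething(result);
        return y == 0;
    }

    int result = x * 2;
    // The call lives inside the assert expression, so when asserts are
    // compiled out with -release the call is not evaluated either.
    assert(postOk(result));
    return result;
}

void main()
{
    assert(foo(21) == 42);
}
```

With more than one return statement, the assert has to be repeated before each of them, which is the case the 'out' contract handles automatically.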

I'll explain my original statement further: If you have a theorem 
prover, then the theorem prover can use the 'out' contract in any 
function which calls that function.

Eg,
int square(int x) out(result) { assert(result>=0); } body { return x*x; }

void foo()
{
    int q = square(-5);
    if (q < 0) { .... }

}
Theorem prover knows that q>=0, even if it doesn't have access to the 
body of 'square'. So it detects unreachable code in foo().

So in this case, the 'out' contract can be used to find bugs in code 
that the author of the contract didn't write.
Otherwise, out contracts only find bugs in the local function, which 
doesn't have much value, since unit testing already performs that role 
(and does it better).
By contrast, 'in' contracts ALWAYS find external bugs rather than local 
ones, so they're an order of magnitude more valuable in the current 
implementation.

Aug 12 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/12/2011 01:31 PM, Don wrote:
 bearophile wrote:
 Don:

 2) I have found many situations where I am able to solve a problem
 with both a simple and slow brute force solver, and a complex and
 fast algorithm to solve a problem. The little program maybe is too
 much slow for normal usage, but it's just few lines long (especially
 if I use lot of std.algorithm stuff) but it's much less likely to
 contain bugs.


 Sorry, but personally I don't believe that this is useful outside of
 toy examples.
 The question is, what bugs does it find that aren't found by a
 trivial unit test?

 There are two cases:
 (1) it's a very tight test. In which case, it's essentially a unit test.
 or (2) it's a very loose test. In which case, it doesn't find bugs.

 Putting a simpler algorithm in the post-condition implements a third
 possibility you are missing.

 Usually unit tests verify some specific cases (you are also able to
 add generic testing code in the unit test, but this is just like
 moving the postcondition elsewhere).

 If you put an alternative algorithm in the postcondition (under
 debug{} if you want), you have some advantages:
 - It's tight, because the second algorithm is supposed to always give
 the same results as the function.

 - It works with the real inputs the program is run on, not just the
 cases you have put in the unit test.

 Conditions required for this to be true:
 (1) the function must not be time critical;

If the difference is not an asymptotic one, it can well be time critical 
(then the debug version will just not be as responsive as would be 
desirable for a finished product, which is often the case anyways.)

 (2) an alternative algorithm must exist;

If an optimized version exists, a slower one exists too.

 (3) the alternative algorithm must be bug-free;

That is often trivial. Also, if it is buggy, the discrepancy will be 
caught by the contract and the bug can be fixed.

 (4) the function must not have been tested properly;

Usually, large software that has been 'tested properly' still contains 
bugs. For mission critical tasks, a form of testing related to this one 
is used heavily (multiple teams implement the same specification and the 
result of each query to the software is determined by majority vote).

 (5) the faulty test cases must occur during debugging (they won't be
 caught during production);

Sure. This can catch e.g. regressions during development. If there is a 
large team of programmers involved, contracts are more useful than if 
there is only a single developer.

 (6) the programmer must remember to put the asserts in the 'out'
 contract, but not put them into the body of the function.

Well, if they are a seasoned contract programmer, this is not a 
problem at all. ;)

 This doesn't leave much.

I disagree.

 Sometimes you forget to add certain cases in the unittests. Putting the
 test in the postcondition makes sure it always runs, for all the inputs
 your function is run on (unless you disable it), so you will catch the
 cases you didn't think of in the unittests.

 The crucial feature is, they do NOTHING except find bugs in the function
 they are attached to.



They specify what the function is supposed to do, in a way that always 
is up to date because it gets checked.

 In Eiffel you have the prestate too (the old), so the postcondition is
 the only place where such information is usable. I hope prestate will
 be added to D DbC, because it's a major sub-feature of DbC. But I
 don't agree that postconditions are useless in D.

 ??? Does that relate to my sentence in any way?

Yes. He says that once prestate is available, out contracts will be more 
useful. But he thinks they are already quite valuable without them.

 For starters, it really needs to be a function with multiple return
 values. Otherwise, you can just stick asserts just before your return
 statement, and you don't need __old or any such thing.

 If a function has multiple return values the out(result) helps make
 sure all the return paths are verified.

 That's what I said.

 If the function has only one return value it helps anyway, because it
 helps you not forget to verify the result.

 ???? Why would you remember to put an assert in the postcondition, when
 you didn't put it into the function?

Don wrote
 bearophile wrote:
 If a function has multiple return values the out(result) helps make
 sure all the return paths are verified.

 That's what I said.

Exactly that reason:

int foo(){
     // some code
     if(condition) return 37; // added after 2h of debugging
     // more code
     result=...;
     assert(condition(result));
     return result;
}

int foo()
out(result){assert(condition(result));}
body{
     //some code
     if(condition) return 37;
     // more code
     return ...;
}

The second version is both more convenient (you don't have to change your 
program logic) and less error-prone.
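
A small runnable variant of the same point, with a hypothetical function `clampToPercent` invented for illustration (`do` is the newer spelling of `body`): the single out contract is checked on every return path, including early returns added later.

```d
int clampToPercent(int v)
out (result) { assert(result >= 0 && result <= 100); }
do
{
    if (v < 0) return 0;     // early return, still checked by the contract
    if (v > 100) return 100; // another early return, still checked
    return v;
}

void main()
{
    assert(clampToPercent(-7) == 0);
    assert(clampToPercent(42) == 42);
    assert(clampToPercent(250) == 100);
}
```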

Furthermore, all other programmers on the project can immediately check 
the postcondition and rely on it holding for the result of any call 
of foo, even if the compiler does not use the out contract for any 
theorem proving. They can even do that before the respective function is 
implemented correctly. Out contracts are particularly useful when they 
are written before the function is implemented completely.


 Under what circumstances are they more valuable than any other
 assert inside a function?

 I have already given some answers.

 No you haven't.

 Another answer is this:


 int foo(int x)
 in {
 // ...
 }
 out(result) {
 auto y = computeSomething(result);
 assert(y ...);
 assert(y ...);
 }
 body {
 // ...
 }

 The out{} helps you organize your code, separating the tests of the
 body from the postcondition tests. Also in the postcondition you are
 allowed to define new variables and call things. All this out(){} code
 vanishes in release mode. How do you do that with just asserts inside
 the body?


 If you do this the asserts will vanish in release mode, but the y will
 be computed still, wasting computations (a smart compiler is able to
 see y is not used and etc, but it's not sure this optimization happens
 if the computation of y is complex and it's done in-place):

 int foo(int x)
 in {
 // ...
 }
 body {
 result = ...;
 auto y = computeSomething(result);
 assert(y ...);
 assert(y ...);
 return result;
 }


 I presume there are ways to disable the computation of y in release
 mode, but I don't want to think about them. I just stick the y
 computation in the postcondition and the compiler will take care of it.

 Trivial!
 Make the postcondition a nested function. (You can even make it a
 delegate literal, if it's only used in one place).

Because everyone who is working on the project wants to check nested 
functions? Sure it works, but it is not the best way to implement 
contract programming. That is why D has language support that goes 
beyond that.

 I'll explain my original statement further: If you have a theorem
 prover, then the theorem prover can use the 'out' contract in any
 function which calls that function.

 Eg,
 int square(int x) out(result) { assert(result>=0); } body { return x*x; }

 void foo()
 {
 int q = square(-5);
 if (q < 0) { .... }

 }
 Theorem prover knows that q>=0, even if it doesn't have access to the
 body of 'square'. So it detects unreachable code in foo().

Theorem prover detects bug in square.

 So in this case, the 'out' contract can be used to find bugs in code
 that the author of the contract didn't write.
 Otherwise, out contracts only find bugs in the local function, which
 doesn't have much value, since unit testing already performs that role
 (and does it better).

In this case, obviously all the unit tests tested square with an input 
that was less than 2^^16 in absolute value. Writing the postcondition 
sometimes also allows you to reflect properly on what the precondition 
should be. Also, it will be tested on possibly unexpected input.
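
To make the overflow concrete, here is a small sketch. `rawSquare` is a hypothetical helper with the same arithmetic as `square` but no contract, so the wrapped value can be observed without tripping the assert. D's `int` is 32 bits and wraps on overflow.

```d
// Same contract as in the example above; `do` is the newer spelling of `body`.
int square(int x)
out (result) { assert(result >= 0); }
do { return x * x; }

// Same arithmetic without the contract, to observe the wrapped value.
int rawSquare(int x) { return x * x; }

void main()
{
    assert(square(-5) == 25);
    assert(rawSquare(46_340) == 2_147_395_600); // largest input whose square fits
    assert(rawSquare(46_341) < 0);              // wraps around to a negative int
    // square(46_341) would therefore trip the out contract in a
    // build with contracts enabled.
}
```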

 By contrast, 'in' functions ALWAYS find external bugs rather than local
 ones, so they're an order of magnitude more valuable in the current
 implementation.

Not always. Sometimes they find bugs in the specification or the in 
contract itself.

Contracts are not only an instrument of verification, but also one of 
specification.

http://en.wikipedia.org/wiki/Design_by_contract

The out contract is not there to verify some internal consistency 
conditions, but to specify what the function should compute, in an exact 
way, that is always up to date. The out contract is for programmers too, 
not only for compilers.

Contract programming is one of these Software Engineering things. :)

The crucial difference between out contract and an assert at the end of 
the function is how they are supposed to be used, not how they will work.

This is reflected by the fact that DMD *.di generation will keep the 
contracts around.


-Timon

Aug 12 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

 bearophile wrote:
 2) I have found many situations where I am able to solve a problem with both a
 simple and slow brute force solver, and a complex and fast algorithm to solve
 a problem. The little program maybe is too much slow for normal usage, but
 it's just few lines long (especially if I use lot of std.algorithm stuff)
 but it's much less likely to contain bugs.

 Sorry, but personally I don't believe that this is useful outside of toy
 examples.

This code of mine is a real-world example. This is a struct method with
comments removed, the postcondition contains both fast loose tests and a tight
slow O(n^2) version that thanks to std.algorithm is just 2 lines long
(unfortunately because of DMD bug 6417 it's a bit longer than 2 lines). It's
asymptotically slower than the fast algorithm, so I've put it into a debug{}.


void foo(in int[] p, int[] q) nothrow
in {
  assert(p.length == vectorLen);
  assert(q.length == vectorLen);
  assert(equal(p.dup.sort(), iota(1, vectorLen+1)));
} out {
  foreach (i, qi; q)
    assert(qi >= 0 && qi < (vectorLen - i));
  debug foreach (j; 1 .. (q.length + 1))
    assert(q[j-1] == count!((int k){ return p[k] > j; })
                       (iota(countUntil(cast()p, j) + 1)));
} body {
  op[0] = &items[0];
  foreach (i, pi; p) {
    items[i] = Item(pi, 0);
    op[i + 1] = &items[i + 1];
  }

  foreach_reverse (k; 0 .. (lim + 1)) {
    xs[0 .. ((vectorLen >> (k + 1)) + 1)] = 0;
    foreach (j; 0 .. vectorLen) {
      int r = (op[j].space >> k) % 2;
      int s = op[j].space >> (k + 1);
      if (r)
        xs[s]++;
      else
        op[j].digit += xs[s];
    }
  }

  foreach (i; 0 .. vectorLen)
    q[op[i].space - 1] = op[i].digit;
}


This postcondition has caught a simple mistake I've put in the fast algorithm.
Probably there are ways to catch the same bug with unittests too.

The ugly empty cast() inside the postcondition is another workaround, because
countUntil doesn't work with a const p.
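
Read on its own, the slow check in the postcondition can be sketched as a standalone function. This assumes p is a permutation of 1 .. n, as the precondition requires. The name `slowTable` is made up here, and the lambda syntax is newer than the delegate literal used above. For each value j it counts the elements to the left of j in p that are greater than j.

```d
import std.algorithm : count, countUntil;
import std.range : iota;

// Brute-force O(n^2) reference: q[j-1] is the number of elements that
// appear before the value j in p and are greater than j.
int[] slowTable(const(int)[] p)
{
    auto q = new int[p.length];
    foreach (j; 1 .. cast(int) p.length + 1)
    {
        immutable pos = p.countUntil(j); // index of the value j in p
        q[j - 1] = cast(int) iota(pos + 1).count!(k => p[k] > j);
    }
    return q;
}

void main()
{
    assert(slowTable([3, 1, 2]) == [1, 1, 0]);
}
```

For p = [3, 1, 2]: the value 1 has one larger element to its left (3), the value 2 also has one (3), and the value 3 has none.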

If you write those two postcondition lines in Python3 it becomes less noisy:

assert q == [sum(p[k] > j for k in range(p.index(j) + 1)) for j in range(1,
len(q)+1)]

Instead of:

foreach (j; 1 .. q.length+1)
    assert(q[j-1] == count!((int k){ return p[k] > j; })
                       (iota(countUntil(cast()p, j) + 1)));


Here using assert(equal(q, map!...)) turns it into puzzle code. It's already
too deeply nested.

If you program in functional style it's hard to keep lines under 70 chars. In
Haskell too, lines of code are often long.

Bye,
bearophile

Aug 12 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 11.08.2011 8:58, Don wrote:
 bearophile wrote:
 Contracts don't replace unittests, they complement each other.


 They are nice but have little value over plain assert _unless_ we 
 are talking about classes and _inheritance_, which isn't the case here.

 It's easy to forget to test the output of a function, the "out" 
 contracts help here. In structs the invariant helps you avoid 
 forgetting to call manually a sanity test function every time you 
 come in and out of a method.

 You're conflating a couple of things here. Invariants are tremendously 
 helpful for structs as well as classes.

I stand corrected about invariants, somehow I wasn't considering them a 
part of contracts.

 "out" contracts seem to be almost useless, unless you have a theorem 
 prover. The reason is, that they test nothing apart from the function 
 they are attached to, and it's much better to do that with unittesting.
 They have very little in common with 'in' contracts.

 I think that EVERY struct and class in Phobos should have an invariant 
 (except for something like Complex, where there are no invalid values).
 But I don't think 'out' contracts would add much value at all.


-- 
Dmitry Olshansky

Aug 11 2011

Jacob Carlborg <doob me.com> writes:

On 2011-08-10 18:02, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.

 When you write some English text you don't write a single block of text, you
organize it into paragraphs, and paragraphs into chapters, chapters into
sections, sections into books, etc. Time ago I have understood that paragraphs
are very good in source code too.

 So I suggest you add a blank line here and there inside your functions to
separate them into paragraphs. I can't give you a style rule, you will need to
create your own style, but often a function that's more than 10 lines long
needs one or more blank lines inside (some people say that every time you see
one of such paragraphs in a function, especially if it has a comment before it,
then you need to perform an "extract method" to improve the code. I believe
this is bad advice).

I always add a blank line before and after statements.

-- 
/Jacob Carlborg

Aug 10 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 10.08.2011 14:42, Dmitry Olshansky wrote:
     In case I failed to mention it before, I'm working on the project 
 codenamed FReD that is aimed at ~100%* source level compatible 
 overhaul of std.regex, that uses better implementation techniques, 
 provides modern Unicode support and common syntax riches.

     I think it's time for a public beta release,  since it _should_ be 
 ready for mainstream usage. There are some rough edges, and a couple 
 issues that I'm aware of but they are nowhere in realistic use cases.

     In order to avoid unexpected regressions I'd be glad if current 
 std.regex users do try it for their projects/tests.
 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.
 I'll upload newer packages as bugs get exposed and fixed. 
 Alternatively, if you are comfortable with git you may just git clone 
 the entire repo. Some helpful notes (same as README) can be found here: 
 https://github.com/blackwhale/FReD/wiki/Beta-release

 Caveats:
     In order for it to compile a tiny change to 2.054 source is needed 
 (no need to recompile Phobos! it's only in templates):
 patch std.algorithm.cmp according to this diff 
 https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4631
<https://github.com/D-Programming-Language/phobos/pull/176/files#L0L4633> 

 and to get CTFE features working add if(!__ctfe) listed in the next 
 diff on the same webpage.
 (this is already upstream, so if you're using a fork of phobos just 
 pull this in)

 * some API problems might lead to a breaking change, though it didn't 
 happen in this release

Meanwhile the new beta is up:
https://github.com/downloads/blackwhale/FReD/FReD_beta1.7z
or checkout "stable" branch https://github.com/blackwhale/FReD/tree/stable
(as dawgfoto noticed, the master branch tends to break on 64-bit, as I 
develop primarily on 32-bit)

The prominent changes are:
- fixed a horrible memory corruption in regexes having certain 
groups/backrefs in lookaround

- no GC heap activity during matching in all engines, except as 
workaround for bug http://d.puremagic.com/issues/show_bug.cgi?id=6199

- new prefix searcher, featuring up to 40x search speed up on patterns 
with semi-fixed prefixes e.g. \b(https?|ftp|file)://\S+  and  
([0-9][0-9]?)/([0-9][0-9]?)/([0-9][0-9]([0-9][0-9])?)

- bool opCast for RegexMatch for a nice "test if not empty" syntax, as 
suggested by Steven
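
For readers unfamiliar with the idiom, a minimal sketch of how a bool opCast enables that syntax. `Match` below is a hypothetical stand-in and not the actual RegexMatch type from FReD.

```d
// Hypothetical stand-in for a match result; not the real RegexMatch type.
struct Match
{
    string[] hits;

    @property bool empty() const { return hits.length == 0; }

    // Consulted by `if (m)`, `!m` and `cast(bool) m`.
    bool opCast(T : bool)() const { return !empty; }
}

void main()
{
    auto found = Match(["abc"]);
    Match nothing;

    if (found) { /* taken: found is not empty */ }
    assert(cast(bool) found);
    assert(!nothing);
}
```

Because only the bool cast is defined, the struct does not convert to other types, which avoids the printing concern raised earlier about alias this.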

- lots of small fixes and optimizations

-- 
Dmitry Olshansky

Aug 16 2011

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of 
 https://github.com/blackwhale/FReD for .7zs.


I have not patched DMD, but it gives me a problem here:

void parseFlags(S)(S flags)
{
    foreach(ch; flags)//flags are ASCII anyway
    {
        switch(ch)
        {

            foreach(i, op; __traits(allMembers, RegexOption))
            {
                case RegexOptionNames[i]:
                        if(re_flags & mixin("RegexOption."~op))
                            throw new RegexException(text("redundant flag specified: ",ch));
                        re_flags |= mixin("RegexOption."~op);
                        break;
            }
            default:
                if(__ctfe)
                   assert(text("unknown regex flag '",ch,"'"));
                else
                    new RegexException(text("unknown regex flag '",ch,"'"));
        }


To better see the situation I have written a small test case:

import std.typetuple: TypeTuple;

enum RegexOption : uint { A, B, C } // no need to put a semicolon here

alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

void main() {
    RegexOption ch;

    switch (ch) {
        foreach (i, op; __traits(allMembers, RegexOption))
            case RegexOptionNames[i]: break;

        default: assert(0);
    }
}


test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

This used to work, I think. The new DMD switch analysis seems to have a bug.

-------------

If you want a benchmark, to compare it with other implementations, there is
this one:
http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

Bye,
bearophile

Aug 16 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 17.08.2011 3:47, bearophile wrote:
 Dmitry Olshansky:

 To get a small no-crap-included beta package see download section of
 https://github.com/blackwhale/FReD for .7zs.


 I have not patched DMD, but it gives me some problem here:

 void parseFlags(S)(S flags)
 {
      foreach(ch; flags)//flags are ASCII anyway
      {
          switch(ch)
          {

              foreach(i, op; __traits(allMembers, RegexOption))
              {
                  case RegexOptionNames[i]:
                          if(re_flags & mixin("RegexOption."~op))
                              throw new RegexException(text("redundant flag specified: ",ch));
                          re_flags |= mixin("RegexOption."~op);
                          break;
              }
              default:
                  if(__ctfe)
                     assert(text("unknown regex flag '",ch,"'"));
                  else
                      new RegexException(text("unknown regex flag '",ch,"'"));
          }


 To better see the situation I have written a small test case:

 import std.typetuple: TypeTuple;

 enum RegexOption : uint { A, B, C } // no need to put a semicolon here

 alias TypeTuple!(RegexOption.A, RegexOption.B, RegexOption.C) RegexOptionNames;

 void main() {
      RegexOption ch;

      switch (ch) {
          foreach (i, op; __traits(allMembers, RegexOption))
              case RegexOptionNames[i]: break;

          default: assert(0);
      }
 }


 test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
 test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
 test.d(12): Error: switch case fallthrough - use 'goto case;' if intended
 test.d(14): Error: switch case fallthrough - use 'goto default;' if intended

 This used to work, I think. The new DMD switch analysis seems to have a bug.

 -------------

Yes, that's a bug. But it's not a regression, I assume you started to 
compile with -w, that's when it happens IIRC. I almost forgot about it, 
thanks for uncovering it again, you may as well file it.

 If you want a benchmark, to compare it with other implementations, there is
this one:
 http://shootout.alioth.debian.org/debian/program.php?test=regexdna&lang=gdc&id=4

All in due time, though this one involves semi-fixed patterns, hm ... 
very promising.

-- 
Dmitry Olshansky

Aug 17 2011

bearophile <bearophileHUGS lycos.com> writes:

Dmitry Olshansky:

 Yes, that's a bug. But it's not a regression,

I think it's a DMD regression, probably introduced with the recent changes in
switch semantics. DMD 2.042 doesn't have this bug.


 I assume you started to compile with -w,

I suggest that Phobos devs use -w too.


 thanks for uncovering it again, you may as well file it.

OK, I'll add it to Bugzilla.

Bye,
bearophile

Aug 17 2011

bearophile <bearophileHUGS lycos.com> writes:

 thanks for uncovering it again, you may as well file it.

 
 OK, I'll add it to Bugzilla.

http://d.puremagic.com/issues/show_bug.cgi?id=6518

Aug 17 2011
