digitalmars.D.learn - regex, enforce and purity

"monarch_dodra" <monarchdodra gmail.com> writes:
Given this little program testing regexes, I decided to replace 
one of the example's asserts with an enforce:

--------
import std.regex;
import std.exception;
void main()
{
     auto m = match("hello world", regex("world"));
     assert(m);
     enforce(m); // <-- HERE
     enforce(cast(bool)m);
     enforce(!m.empty);
}
--------

I get the compile errors:
src\phobos\std\exception.d(356): Error: pure function 'enforce' 
cannot call impure function '~this'
src\phobos\std\exception.d(358): Error: pure function 'enforce' 
cannot call impure function 'opCast'

While I understand the problem at play, I have a few doubts:
1) Why the difference between assert and enforce? Shouldn't both 
have the same restraints?
2) What exactly does purity mean for a *member* function?
3) And shouldn't RegexMatch's .opCast (and .empty) be 
qualified as pure?

...

I did some digging while typing, and was about to suggest that 
the problem could be solved if enforce was required to take a 
boolean as an argument (makes sense), forcing the cast *outside* 
of the enforce. However, it would appear that enforce returns its 
value, the goal (probably) being to make this legal:

auto bar = enforce(foo());

The return value is enforced and passed to bar in a 1-liner.

BUT... assert doesn't do that. THAT is the original source of the 
difference in behavior.

So I'll rephrase my 1):
Why the difference in behavior regarding the return value? Is it 
just historical/no real reason, or is there something for me to 
learn here?
Sep 09 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 09, 2012 21:04:58 monarch_dodra wrote:
> So I'll rephrase my 1):
> Why the difference in behavior regarding the return value? Is it
> just historical/no real reason, or is there something for me to
> learn here?

Just look at the examples. You're supposed to be able to use enforce to enforce that the value is explicitly convertible to true and then use the value, all in one expression. assert gets compiled out in -release mode, so that sort of behavior doesn't make sense for assert. enforce, on the other hand, is _always_ there, so using it as part of a larger expression can make sense. I don't think that it gets used that way very often though, because it only makes sense when testing whether the result is convertible to true, and most expressions used with enforce are already boolean expressions (in which case, using the value afterwards usually doesn't make sense).

The problem here is that you seem to have found a way to use an impure function inside of enforce. @safe and pure were added onto it with the idea that it was impossible for it to be otherwise.

    T enforce(T)(T value, lazy const(char)[] msg = null,
                 string file = __FILE__, size_t line = __LINE__) @safe pure
    {
        if (!value) bailOut(file, line, msg);
        return value;
    }

It's not calling any functions on value, so in theory, it shouldn't require any functions on T to be either @safe or pure. But it looks like the destructor is being called (which I guess isn't surprising if you really think about it), and since an explicit cast is happening (in the if, !value becomes !cast(bool)value), the cast operator would probably need to be @safe and pure as well.

It's a bug in enforce. Since it apparently _is_ possible for non-@safe and non-pure functions to be called within enforce, it should use attribute inference for them rather than explicitly listing them. Please report it.

- Jonathan M Davis
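A minimal sketch of what relying on attribute inference could look like. This is not the Phobos implementation: `checked`, `Flag`, `castCount` and `twice` are hypothetical names made up for illustration, and the lazy message and file/line parameters are left out to keep it short. Because `checked` is a template with no explicit attributes, the compiler infers pure and @safe separately for each instantiation.

```d
// Hypothetical stand-in for enforce: a template with no explicit attributes,
// so pure/@safe are inferred per instantiation from what the body calls.
T checked(T)(T value, string msg = "check failed")
{
    if (!value)                 // for a struct this becomes !cast(bool) value
        throw new Exception(msg);
    return value;
}

int castCount;                  // mutable module-level state

struct Flag                     // hypothetical type whose opCast is impure
{
    bool ok;
    bool opCast(T : bool)() { ++castCount; return ok; }
}

// checked!int calls nothing impure, so it can be used from a pure function.
int twice(int x) pure { return checked(x) * 2; }

void main()
{
    assert(twice(21) == 42);

    // Also compiles: checked!Flag is simply not inferred pure.
    auto f = checked(Flag(true));
    assert(f.ok && castCount == 1);
}
```

With an explicit `pure` on `checked`, the `Flag` instantiation would be rejected in the same way as the errors quoted in the original post.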
Sep 09 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 09-Sep-12 23:04, monarch_dodra wrote:
> Given this little program testing regexs, I decided to replace one of
> the example's assert with an enforce:
>
> --------
> import std.regex;
> import std.exception;
> void main()
> {
>      auto m = match("hello world", regex("world"));
>      assert(m);
>      enforce(m); // <-- HERE
>      enforce(cast(bool)m);
>      enforce(!m.empty);
> }
> --------
>
> I get the compile errors:
> src\phobos\std\exception.d(356): Error: pure function 'enforce' cannot
> call impure function '~this'
> src\phobos\std\exception.d(358): Error: pure function 'enforce' cannot
> call impure function 'opCast'
>
> While I understand the problem at play, I have a few doubts:
> 1)Why the difference between assert and enforce? Shouldn't both have the
> same restraints?

Nope, assert is built-in and thus is an enigma ;)
> 2)What exactly does purity mean for a *member* function?

Similar to the usual free function.
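To make that concrete, here is a small sketch (the `Point` type and `counter` variable are hypothetical, made up for illustration). A member function receives `this` as a hidden parameter, so a pure member may read and modify its own object, just as a pure free function may work on its arguments, but it may not touch mutable global state.

```d
int counter;    // mutable module-level state, off limits to pure code

struct Point    // hypothetical example type
{
    int x, y;

    // Reads only through `this`: fine for a pure member.
    int sum() const pure { return x + y; }

    // Writes through `this`: still pure, since `this` is just a parameter.
    void shift(int d) pure { x += d; y += d; }

    // Would not compile: a pure function cannot access mutable globals.
    // int bad() pure { return x + counter; }
}

void main()
{
    auto p = Point(1, 2);
    p.shift(3);
    assert(p.sum() == 9);
}
```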
> 3)And shouldn't RegexMatch's .opCast (and .empty) should be qualified as
> pure?

I've no idea. Pure/nothrow zealots might have made enforce pure, but it really shouldn't always be. In fact it should be a template and rely on deduction; if it already is, then it's the deduction that is broken.

Also, I see that enforce tries to copy the RegexMatch object. This involves the destructor, and that can't be pure: it's a ref-counted entity around a C-heap memory chunk. Last time I checked, destructors & postblits were mostly broken w.r.t. pure/safe/immutable etc.
> ...
>
> I did some digging while typing, and was about to suggest that the
> problem could be solved if enforce was required to take a boolean as an
> argument (makes sense), forcing the cast *outside* of the enforce.
> However, it would appear that enforce returns its value, the goal
> (probably) being to make this legal:

It's a case of some tricky and cool stuff that sometimes isn't. The idea of enforce was that the passed-in object is tested with if, using whatever implicit conversion is possible, and is returned as is if it passes the 'if test'. The trick is that it enables some convenient things:

    auto f = enforce(fopen("blah", "r"));

Now in your case this will probably work better:

    m = enforce(move(m));

as it technically shouldn't call m's destructor. enforce is specifically geared toward r-values & RVO.
> auto bar = enforce(foo());
>
> The return value is enforced and passed to bar in a 1-liner.
>
> BUT... assert doesn't do that. THAT is the original source of the
> difference in behavior.

They are different on so many levels and serve different needs. More than that, it's a library artifact vs. a built-in statement.
> So I'll rephrase my 1):
> Why the difference in behavior regarding the return value? Is it just
> historical/no real reason, or is there something for me to learn here?

Aside from that, assert shouldn't affect control flow in any way, thus:

    m = assert(m); // wouldn't make any sense, as assert gets stripped in release builds

enforce, by contrast, is a convenient way to check some inputs and possible states and throw if they are not good.

-- 
Dmitry Olshansky
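As an illustration of the "check and use in one expression" pattern, here is a short sketch. The `firstNegative` helper is hypothetical and only stands in for any call that reports failure with a null-like value, such as the fopen example above.

```d
import std.exception : enforce, assertThrown;

// Hypothetical lookup that signals failure by returning null.
int* firstNegative(int[] xs)
{
    foreach (ref x; xs)
        if (x < 0) return &x;
    return null;
}

void main()
{
    int[] data = [3, -1, 4];

    // enforce tests the pointer and hands it back in the same expression.
    int* p = enforce(firstNegative(data), "no negative element");
    assert(*p == -1);

    // On failure it throws an Exception, in release builds as well.
    assertThrown(enforce(firstNegative([1, 2]), "no negative element"));
}
```

The assert lines here only check the example itself; unlike the enforce calls, they are the part that disappears under -release.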
Sep 09 2012
"monarch_dodra" <monarchdodra gmail.com> writes:
Thank you both for your replies, they make perfect sense.

I filed a bug report here:
http://d.puremagic.com/issues/show_bug.cgi?id=8637
Sep 10 2012