digitalmars.D - An old wart - opApply's return type is an int

"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
No new arguments here, just an expanded, and largely new, audience. The 
old arguments were perfectly good anyway. I'll reiterate them now.

Consider std.openrj.Record.opApply():

  int opApply(int delegate(in char[] name, in char[] value) dg)
  {
    int result  = 0;
    foreach(Field field; m_fields)
    {
      result = dg(field.name(), field.value());
      if(0 != result)
      {
        break;
      }
    }
    return result;
  }


The result value is the vector by which the compiler-generated delegate 
'dg' communicates with opApply() and with the compiler-translated code 
in the foreach statement.

The rule is that the implementor of opApply() either returns 0, or 
returns the result of the delegate. NOTHING ELSE IS ALLOWED. (And if you 
do _anything else_, you're likely to find your foreach client code doing 
all manner of weird things, as though it'd been told to 
break/goto/continue/next/<whatever>.)
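
For anyone new to the protocol, here is a minimal sketch of the rule in 
present-day D syntax (an assumption; the code in this post predates it). 
The names Fields and visitedUntil are made up for illustration and are 
not part of std.openrj.

```d
// Hypothetical container; only the shape of opApply() matters here.
struct Fields
{
    string[] names;

    // Either return 0 (iteration ran to completion) or hand the
    // delegate's non-zero result straight back to the caller.
    int opApply(scope int delegate(ref string name) dg)
    {
        foreach (ref name; names)
        {
            int result = dg(name);
            if (result != 0)
                return result; // the loop body asked to leave early
        }
        return 0;
    }
}

// Counts how many names the foreach body sees before `stopAt` is found.
size_t visitedUntil(Fields fields, string stopAt)
{
    size_t visited = 0;
    foreach (ref string name; fields)
    {
        ++visited;
        if (name == stopAt)
            break; // surfaces in opApply() as a non-zero delegate result
    }
    return visited;
}
```

Calling visitedUntil(Fields(["a", "b", "c"]), "b") stops after the second 
element, because opApply() forwards the non-zero result unchanged.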

There are four points where result is manipulated in this very basic 
opApply(). (FYI: it's possible to have much more complex ones.)

1.
    int result  = 0;

2.
      result = dg(field.name(), field.value());

3.
      if(0 != result)
      {
        break;
      }

4.
    return result;

Now D claims to 'make it hard for programmers to make mistakes', or some 
such. I can assign any value to an int, and yet the semantics of 
opApply() require that I only ever assign 0 or the result of calling the 
delegate. This is a contradiction.

It would be a trivial matter to fluff any of those four points in the 
function. Imagine if you're doing complex things and using 'res' for 
your processing and 'ret' for your return value. Nasty.
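
To make the failure mode concrete, here is a deliberately broken sketch, 
again in present-day D syntax (an assumption). LeakyFields and bodyRuns 
are hypothetical names. The only mistake is that the delegate's result 
is dropped, which corresponds to fluffing points 2 to 4 above.

```d
// Hypothetical container with a deliberately wrong opApply().
struct LeakyFields
{
    string[] names;

    // BUG (on purpose): the delegate's result is discarded, so a
    // `break` in the loop body never stops the iteration.
    int opApply(scope int delegate(ref string name) dg)
    {
        foreach (ref name; names)
            dg(name);
        return 0;
    }
}

// Returns how many times the foreach body actually ran.
size_t bodyRuns(LeakyFields fields)
{
    size_t runs = 0;
    foreach (ref string name; fields)
    {
        ++runs;
        break; // meant to stop after the first element
    }
    return runs;
}
```

The compiler accepts this without complaint, yet the loop body runs once 
per element instead of once in total.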

Developers should be diligent, to be sure, but we're all human.

The solution, as I proposed a long long time ago, is to define 
opApply() as having a return type of 
OPAPPLY_RETURN_VALUE (or OpApplyReturn, or whatever). This would be an 
enum with *one* element:

    enum OPAPPLY_RETURN_VALUE
    {
        COMPLETE    =    0
    }

Because there's only one value - that we know about, anyway - it'd not 
be too difficult to make sure it was initialised correctly. The compiler 
would notionally cast things to and from this enum type in the foreach 
and delegates, but we'd never have to know or care about this. 
Everything would work exactly as now; the object code would likely 
contain exactly the same bytes as it does now. And there'd be no way to 
unwittingly screw with foreach/opApply().
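
A rough user-level model of the idea, in present-day D syntax (an 
assumption). The real change would live in the compiler, so this sketch 
uses an ordinary function, applyTo, in place of opApply(). The names 
OpApplyReturn, Field, Visitor and applyTo are all hypothetical.

```d
// The one value user code would be able to name.
enum OpApplyReturn : int
{
    complete = 0
}

// Hypothetical stand-in for the Field type used in the post.
struct Field
{
    string name;
    string value;
}

// The delegate returns the enum too, so no cast is needed below.
alias Visitor = OpApplyReturn delegate(string name, string value);

// Same shape as the opApply() above, with the enum in place of int.
OpApplyReturn applyTo(Field[] fields, scope Visitor dg)
{
    foreach (field; fields)
    {
        OpApplyReturn result = dg(field.name, field.value);
        if (result != OpApplyReturn.complete)
            return result;
    }
    return OpApplyReturn.complete;
}

// An arbitrary int no longer slips in by accident ...
static assert(!__traits(compiles, { OpApplyReturn r = 5; }));
// ... while the single named value is still accepted.
static assert(__traits(compiles, { OpApplyReturn r = OpApplyReturn.complete; }));
```

Anything other than complete or a delegate result would then need an 
explicit cast, which is hard to write by mistake.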

Since this costs developers nothing, and Walter only some small and 
entirely one-off effort, and it guarantees that foreach/opApply() 
stuff-ups can only occur as a result of determined malicious effort, 
rather than an all-too-easy mistake, I ask if anyone would put forth any 
arguments why it should not be adopted. D is, after all, making all 
kinds of claims to be helping developers write error-free code ...

Cheers

Matthew
Mar 07 2005
Kris <Kris_member pathlink.com> writes:
Presumably the delegate return-type should equate with the enum also? Otherwise,
one would have to explicitly cast the assignment to 'result'.

From what I recall, Walter didn't comment upon why he apparently prefers the
open-ended 'int' instead ... I'll go out on a limb and hazard that he /might/
prefer the obfuscation of the delegate return; i.e. if all the various
'commands' were to be spelt out in an enum declaration, some darned developer
might take "advantage" of that. Or, similarly, perhaps the values are
compiler-specific?

That's conjecture, of course. But it's the best I can come up with against the
notion.

I'm certainly a fan of tightening up valid ranges, so would also like to hear
why this remains the way that it is ...
Mar 07 2005
"Matthew" <admin stlsoft.dot.dot.dot.dot.org> writes:
"Kris" <Kris_member pathlink.com> wrote in message 
news:d0jhf6$1geo$1 digitaldaemon.com...
> Presumably the delegate return-type should equate with the enum also?
> Otherwise, one would have to explicitly cast the assignment to 'result'.
>
> From what I recall, Walter didn't comment upon why he apparently prefers
> the open-ended 'int' instead ... I'll go out on a limb and hazard that he
> /might/ prefer the obfuscation of the delegate return; i.e. if all the
> various 'commands' were to be spelt out in an enum declaration, some
> darned developer might take "advantage" of that. Or, similarly, perhaps
> the values are compiler-specific?

But I haven't said that they'd be visible. Indeed, I specifically said that
only the 0 value would be visible. All other values would be 'known' _only_
to the compiler (implementor).

> That's conjecture, of course. But it's the best I can come up with
> against the notion.

Since I haven't said that, I assume that you agree with me that there is no
(good) argument against.

> I'm certainly a fan of tightening up valid ranges, so would also like to
> hear why this remains the way that it is ...

Walter???
Mar 07 2005