www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 4290] New: 'Fragile' opCmp/toHash signature errors

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290

           Summary: 'Fragile' opCmp/toHash signature errors
           Product: D
           Version: future
          Platform: All
        OS/Version: All
            Status: NEW
          Keywords: diagnostic
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc


--- Comment #0 from bearophile_hugs eml.cc 2010-06-06 13:08:45 PDT ---
This is a reduced version of a bug in my code. This D2 code compiles, but at
the end there are two instances of Foo inside the AA.


struct Foo {
    int x;
    const bool opEquals(ref const Foo f) {
        return true;
    }
    int opCmp(ref const Foo f) {
        return 0;
    }
    hash_t toHash() {
        return 10;
    }
}
void main() {
    int[Foo] aa;
    aa[Foo(1)]++;
    aa[Foo(2)]++;
    assert(aa.length == 1); // asserts, it's equal to 2
}


The problem is caused just by two missing 'const', the following code is
correct:


struct Foo {
    int x;
    const bool opEquals(ref const Foo f) {
        return true;
    }
    int opCmp(ref const Foo f) const {
        return 0;
    }
    hash_t toHash() const {
        return 10;
    }
}
void main() {
    int[Foo] aa;
    aa[Foo(1)]++;
    aa[Foo(2)]++;
    assert(aa.length == 1); // OK
}


In my opinion the D2 has to catch such programming errors and refuse opCmp and
toHash with a wrong signature, to avoid similar common bugs.

See also bug 3844

-----------------------------

An alternative solution is to invent something similar to the 'override'
keyword for struct member functions:

struct Foo {
    int x;
    const bool opEquals(ref const Foo f) {
        return true;
    }
    static_override int opCmp(ref const Foo f) { // error, doesn't override the
default opCmp
        return 0;
    }
    static_override hash_t toHash() { // error, doesn't override the default
toHash
        return 10;
    }
}
void main() {
    int[Foo] aa;
    aa[Foo(1)]++;
    aa[Foo(2)]++;
    assert(aa.length == 1); // asserts, it's equal to 2
}


But D2 structs don't support inheritance, so static_override is of limited
usefulness, it looks over-engineering.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 06 2010
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290


Kevin Bealer <kevinbealer gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kevinbealer gmail.com


--- Comment #1 from Kevin Bealer <kevinbealer gmail.com> 2010-09-15 21:26:18
PDT ---
Just ran into this as well; also, note that the page here gives the wrong
version of the signature; it omits the critical const, so anyone running into
this problem will be misguided.

http://www.digitalmars.com/d/2.0/operatoroverloading.html#compare

Also... it looks like some of the standard code such as in std.bitarray does
this without the "const", so I imagine the standard library is broken, though I
haven't tested whether bitarray's sort properly.

I agree with bearophile's recommendation -- there is no value in accepting a
signature for "standard" interfaces like opCmp that will not work.

One other detail: both of the opCmp signatures seem to work when "<" is used,
but for some reason, not with .sort. ??

I'm testing with 2.048.

Thanks,
Kevin Bealer

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 15 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com


--- Comment #2 from Steven Schveighoffer <schveiguy yahoo.com> 2010-09-16
06:10:07 PDT ---
Some more info about how the runtime deals with struct "polymorphism" and opCmp
and toHash etc.

In TypeInfo_Struct, there are the following function pointers:

http://www.dsource.org/projects/druntime/browser/trunk/src/object_.d?rev=376#L930

The compiler sets these if you define the appropriate functions with the
correct signatures.

Then the runtime uses the TypeInfo.compare function to compare two values, or
TypeInfo.getHash.  The TypeInfo_Struct overrides these functions to check if
those function pointers are valid, and if so, call them to compare two values. 
Otherwise, it does a bit-compare.

Any runtime function that uses these functions (including array sort) will be
affected.

I disagree that the compiler should reject opCmp if it doesn't match the
special signature, opCmp is simply a function.  There are valid uses of opCmp
that do not match the signature.  For example, the compiler currently requires
struct opEquals to have a signature of

bool opEquals(ref const(T) other) const

Which means you can't compare two rvalues (bug already filed).  This is the
mess that will occur if you start requiring special signatures for all the
others.

However, I think it would be good to add an attribute indicating that the
compiler *should* reject the function if it doesn't match, as bearophile
suggests.  However, I think it should be an  attribute, not a keyword.  My
suggestion is  typeinfo to signify that this function belongs in the typeinfo.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 16 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #3 from bearophile_hugs eml.cc 2010-09-16 17:21:12 PDT ---
Answer to comment2:

I understand what you say only partially. In my opinion silently ignoring the
opCmp() and toHash() like in my first example is not acceptable for the D
language (that was a bug in my code).

The general design of D is to perform safe things on default, and unsafe on
request. And here the safer thing is to perform the conformance tests on
default and disable them on request. I agree that an attribute is better here.
So a  notypeinfo seems better, if it's implementable and useful.

I don't like the  typeinfo/ notypeinfo names a lot because they look too much
tied to the underlying implementation, instead to their true semantic
meaning/purpose.

In LDC there is pragma(no_typeinfo):
http://www.dsource.org/projects/ldc/wiki/Docs#no_typeinfo
So I think it's better to find a different name for the attribute, different
from  notypeinfo.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 16 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #4 from Steven Schveighoffer <schveiguy yahoo.com> 2010-09-17
09:18:52 PDT ---
(In reply to comment #3)
 Answer to comment2:
 
 I understand what you say only partially. In my opinion silently ignoring the
 opCmp() and toHash() like in my first example is not acceptable for the D
 language (that was a bug in my code).
But those are valid functions. Why should the D compiler refuse to compile them? See my point about opEquals. This is probably a job for a lint-type tool. To have the compiler assume what you are thinking can be just as bad, in fact even worse, than blindly obeying your instructions.
 The general design of D is to perform safe things on default, and unsafe on
 request. And here the safer thing is to perform the conformance tests on
 default and disable them on request. I agree that an attribute is better here.
 So a  notypeinfo seems better, if it's implementable and useful.
This has nothing to do with memory safety, which is what safe D is all about. opCmp and friends are first of all functions, with all the flexibility that functions enjoy. These kinds of problems occur in many places where duck typing is used. For example, you can mis-define some range by doing something like popfront(). The compiler won't refuse to compile this, and shouldn't. But it will refuse to allow it to be passed into a range-accepting function. Where do we draw the line? Note that opCmp can legitimately take all kinds of arguments, and legitimately not be const. One possible solution for this is to get rid of the runtime-properties all together, and just use templates/duck typing fully. Then you get a compile-time error when you define it improperly and use it in something like an AA or sort. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 17 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #5 from bearophile_hugs eml.cc 2010-09-20 05:25:27 PDT ---
 Why should the D compiler refuse to compile them?
The D compiler has to disallow or warn against common traps.
 This is probably a job for a lint-type tool.
If some bug may be found at compile-time and it's not too much hard to find it (example: it doesn't require flow analysis), then in my opinion it's a job of the (D) compiler to avoid the bug. Clang compiler designers seem to agree with me.
 This has nothing to do with memory safety, which is what safe D is all about. 
I was not talking about memory safety or SafeD, I was talking about code safety in general. Generally D needs to be designed to avoid common bugs, when possible (another example of bug that I'd like D to avoid is to accept strings like "<<>" as operators for the operator overloading string template).
 For example, you can mis-define some range by doing something
 like popfront().  The compiler won't refuse to compile this, and shouldn't. 
 But it will refuse to allow it to be passed into a range-accepting function. 
 Where do we draw the line?
That's a nice example. I don't have a generic answer for your question, but to let this function be accepted and silently ignored: int opCmp(ref const Foo f) { because this is the only correct one: int opCmp(ref const Foo f) const { is too much bug-prone, in my opinion. This is beyond the line for me. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #6 from Steven Schveighoffer <schveiguy yahoo.com> 2010-09-20
06:03:43 PDT ---
(In reply to comment #5)
 Why should the D compiler refuse to compile them?
The D compiler has to disallow or warn against common traps.
But opCmp is a valid function! I can still call it, even though it's not put into the typeinfo.
 This is probably a job for a lint-type tool.
If some bug may be found at compile-time and it's not too much hard to find it (example: it doesn't require flow analysis), then in my opinion it's a job of the (D) compiler to avoid the bug. Clang compiler designers seem to agree with me.
But it's not a bug! opCmp is a valid symbol name for a function, and can be called with the arguments you specify. In fact, it even works with the comparison operators. There is nothing inherently special about opCmp. We need to add special notation conveying our intentions to the compiler (that it should be used in the TypeInfo). It's like refusing to compile a function because it's not an english word. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #7 from bearophile_hugs eml.cc 2010-09-20 16:46:25 PDT ---
It seems we are in irreducible disagreement here.

 But opCmp is a valid function!  I can still call it, even though it's not put
into the typeinfo. Of course. But when it is well formed it is also a *special* function, because the compiler uses it automatically in some useful situations.
 But it's not a bug!
If you define a Foo struct like this, it' quite probably a bug in your code: struct Foo { int x; const bool opEquals(ref const Foo f) { return true; } int opCmp(ref const Foo f) { return 0; } hash_t toHash() { return 10; } }
 There is nothing inherently special about opCmp.
It's special because there is a contract between the compiler and the programmer, that such functions are used for example when you sort an array of structs or you put them in an associative array. If the compiler in some situations silently doesn't use such functions when the programmer surely intended them to be used that way by the compiler, then there's something wrong with the compiler.
 We need to add special notation conveying our intentions to the compiler
This is wrong. D language is designed to adopt the safe behaviour (= less bug prone, I am not talking about safeD) on default. So if feel the need of it, then you may add special notation to say the compiler to not use a function named opCmp for comparisons.
 It's like refusing to compile a function because it's not an english word.
This is not the same thing. In Foo there are functions that have exactly the same name as the standard methods, and their signature too is very similar. I am not asking for the compiler to give a warning for a opcompare or even opcmp function. And by the way, what I am asking is an extension of what dmd is already doing, because in many other cases if you try to compile a struct that defined a wrong opCmp or toHash the compiler do shows an error. So I am asking for the error messages to cover this case too. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #8 from Kevin Bealer <kevinbealer gmail.com> 2010-09-20 20:04:28
PDT ---
I think when you are defining opCmp, it's a bit like overriding a function
signature, therefore, if D tries to prevent function hijacking, it makes
perfect sense for it to do the same thing here.

I have no problem with an enforced single interface for opCmp and opEquals and
so on.  I have no problem with having flexible interfaces where const or ref is
optional and up to the user.  I think either of these will work -- and I prefer
the second, actually.  Flexible is good, so long as it is practical to
implement.

But for the compiler to correctly generate code that calls the flexible API
(pick any signature you like) for opCmp when "<" is used, but does not call it
at all when ".sort" is used unless the signature matches a hidden template?

Come on, it can't really be meant to work like that, right?  It doesn't make
sense for the language to have two different behaviors.  It's an inconsistency
that doesn't make any sense from a language or user point of view.

I do think that for value types the .sort and T[] behaviors are best done in
something like a template mechanism, semantically.  Classes can use the
Object.opCmp version but for struct and primitives it makes sense to fully and
automatically specialize the code and build that into the language.  (Or at
least call whichever opCmp is present.)

Kevin

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 20 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #9 from Steven Schveighoffer <schveiguy yahoo.com> 2010-09-21
06:16:08 PDT ---
I will say the fact that sort does not use opCmp, but < does is bad.  Worse
than that, it uses a default opCmp which may appear to work but doesn't really.
 However the solution is not to disable other useful declarations of opCmp, but
rather to fix sort.  In fact, I think sort should be deprecated for
std.algorithm.sort, which does use any opCmp defined.

I'd like to see the whole "only save a function pointer in typeinfo if the
signature matches exactly" just go away.  I think we can solve these problems
in better ways.  It's a legacy thing that we can safely get rid of.

Barring that, having a designation so the compiler can make an informed
decision would be the second best option.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 21 2010
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4290



--- Comment #10 from bearophile_hugs eml.cc 2011-04-06 16:56:40 PDT ---
See also bug 1309

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Apr 06 2011