digitalmars.D.bugs - [Issue 4290] New: 'Fragile' opCmp/toHash signature errors
- d-bugmail puremagic.com (85/85) Jun 06 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (23/23) Sep 15 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (33/33) Sep 16 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (20/20) Sep 16 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (23/32) Sep 17 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (20/27) Sep 20 2010 The D compiler has to disallow or warn against common traps.
- d-bugmail puremagic.com (15/24) Sep 20 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (40/45) Sep 20 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (25/25) Sep 20 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (15/15) Sep 21 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4290
- d-bugmail puremagic.com (6/6) Apr 06 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4290
http://d.puremagic.com/issues/show_bug.cgi?id=4290 Summary: 'Fragile' opCmp/toHash signature errors Product: D Version: future Platform: All OS/Version: All Status: NEW Keywords: diagnostic Severity: enhancement Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: bearophile_hugs eml.cc This is a reduced version of a bug in my code. This D2 code compiles, but at the end there are two instances of Foo inside the AA. struct Foo { int x; const bool opEquals(ref const Foo f) { return true; } int opCmp(ref const Foo f) { return 0; } hash_t toHash() { return 10; } } void main() { int[Foo] aa; aa[Foo(1)]++; aa[Foo(2)]++; assert(aa.length == 1); // asserts, it's equal to 2 } The problem is caused just by two missing 'const', the following code is correct: struct Foo { int x; const bool opEquals(ref const Foo f) { return true; } int opCmp(ref const Foo f) const { return 0; } hash_t toHash() const { return 10; } } void main() { int[Foo] aa; aa[Foo(1)]++; aa[Foo(2)]++; assert(aa.length == 1); // OK } In my opinion the D2 has to catch such programming errors and refuse opCmp and toHash with a wrong signature, to avoid similar common bugs. See also bug 3844 ----------------------------- An alternative solution is to invent something similar to the 'override' keyword for struct member functions: struct Foo { int x; const bool opEquals(ref const Foo f) { return true; } static_override int opCmp(ref const Foo f) { // error, doesn't override the default opCmp return 0; } static_override hash_t toHash() { // error, doesn't override the default toHash return 10; } } void main() { int[Foo] aa; aa[Foo(1)]++; aa[Foo(2)]++; assert(aa.length == 1); // asserts, it's equal to 2 } But D2 structs don't support inheritance, so static_override is of limited usefulness, it looks over-engineering. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jun 06 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 Kevin Bealer <kevinbealer gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |kevinbealer gmail.com PDT --- Just ran into this as well; also, note that the page here gives the wrong version of the signature; it omits the critical const, so anyone running into this problem will be misguided. http://www.digitalmars.com/d/2.0/operatoroverloading.html#compare Also... it looks like some of the standard code such as in std.bitarray does this without the "const", so I imagine the standard library is broken, though I haven't tested whether bitarray's sort properly. I agree with bearophile's recommendation -- there is no value in accepting a signature for "standard" interfaces like opCmp that will not work. One other detail: both of the opCmp signatures seem to work when "<" is used, but for some reason, not with .sort. ?? I'm testing with 2.048. Thanks, Kevin Bealer -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 15 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 Steven Schveighoffer <schveiguy yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |schveiguy yahoo.com 06:10:07 PDT --- Some more info about how the runtime deals with struct "polymorphism" and opCmp and toHash etc. In TypeInfo_Struct, there are the following function pointers: http://www.dsource.org/projects/druntime/browser/trunk/src/object_.d?rev=376#L930 The compiler sets these if you define the appropriate functions with the correct signatures. Then the runtime uses the TypeInfo.compare function to compare two values, or TypeInfo.getHash. The TypeInfo_Struct overrides these functions to check if those function pointers are valid, and if so, call them to compare two values. Otherwise, it does a bit-compare. Any runtime function that uses these functions (including array sort) will be affected. I disagree that the compiler should reject opCmp if it doesn't match the special signature, opCmp is simply a function. There are valid uses of opCmp that do not match the signature. For example, the compiler currently requires struct opEquals to have a signature of bool opEquals(ref const(T) other) const Which means you can't compare two rvalues (bug already filed). This is the mess that will occur if you start requiring special signatures for all the others. However, I think it would be good to add an attribute indicating that the compiler *should* reject the function if it doesn't match, as bearophile suggests. However, I think it should be an attribute, not a keyword. My suggestion is typeinfo to signify that this function belongs in the typeinfo. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 16 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 Answer to comment2: I understand what you say only partially. In my opinion silently ignoring the opCmp() and toHash() like in my first example is not acceptable for the D language (that was a bug in my code). The general design of D is to perform safe things on default, and unsafe on request. And here the safer thing is to perform the conformance tests on default and disable them on request. I agree that an attribute is better here. So a notypeinfo seems better, if it's implementable and useful. I don't like the typeinfo/ notypeinfo names a lot because they look too much tied to the underlying implementation, instead to their true semantic meaning/purpose. In LDC there is pragma(no_typeinfo): http://www.dsource.org/projects/ldc/wiki/Docs#no_typeinfo So I think it's better to find a different name for the attribute, different from notypeinfo. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 16 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 09:18:52 PDT ---Answer to comment2: I understand what you say only partially. In my opinion silently ignoring the opCmp() and toHash() like in my first example is not acceptable for the D language (that was a bug in my code).But those are valid functions. Why should the D compiler refuse to compile them? See my point about opEquals. This is probably a job for a lint-type tool. To have the compiler assume what you are thinking can be just as bad, in fact even worse, than blindly obeying your instructions.The general design of D is to perform safe things on default, and unsafe on request. And here the safer thing is to perform the conformance tests on default and disable them on request. I agree that an attribute is better here. So a notypeinfo seems better, if it's implementable and useful.This has nothing to do with memory safety, which is what safe D is all about. opCmp and friends are first of all functions, with all the flexibility that functions enjoy. These kinds of problems occur in many places where duck typing is used. For example, you can mis-define some range by doing something like popfront(). The compiler won't refuse to compile this, and shouldn't. But it will refuse to allow it to be passed into a range-accepting function. Where do we draw the line? Note that opCmp can legitimately take all kinds of arguments, and legitimately not be const. One possible solution for this is to get rid of the runtime-properties all together, and just use templates/duck typing fully. Then you get a compile-time error when you define it improperly and use it in something like an AA or sort. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 17 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290Why should the D compiler refuse to compile them?The D compiler has to disallow or warn against common traps.This is probably a job for a lint-type tool.If some bug may be found at compile-time and it's not too much hard to find it (example: it doesn't require flow analysis), then in my opinion it's a job of the (D) compiler to avoid the bug. Clang compiler designers seem to agree with me.This has nothing to do with memory safety, which is what safe D is all about.I was not talking about memory safety or SafeD, I was talking about code safety in general. Generally D needs to be designed to avoid common bugs, when possible (another example of bug that I'd like D to avoid is to accept strings like "<<>" as operators for the operator overloading string template).For example, you can mis-define some range by doing something like popfront(). The compiler won't refuse to compile this, and shouldn't. But it will refuse to allow it to be passed into a range-accepting function. Where do we draw the line?That's a nice example. I don't have a generic answer for your question, but to let this function be accepted and silently ignored: int opCmp(ref const Foo f) { because this is the only correct one: int opCmp(ref const Foo f) const { is too much bug-prone, in my opinion. This is beyond the line for me. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 06:03:43 PDT ---But opCmp is a valid function! I can still call it, even though it's not put into the typeinfo.Why should the D compiler refuse to compile them?The D compiler has to disallow or warn against common traps.But it's not a bug! opCmp is a valid symbol name for a function, and can be called with the arguments you specify. In fact, it even works with the comparison operators. There is nothing inherently special about opCmp. We need to add special notation conveying our intentions to the compiler (that it should be used in the TypeInfo). It's like refusing to compile a function because it's not an english word. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------This is probably a job for a lint-type tool.If some bug may be found at compile-time and it's not too much hard to find it (example: it doesn't require flow analysis), then in my opinion it's a job of the (D) compiler to avoid the bug. Clang compiler designers seem to agree with me.
Sep 20 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 It seems we are in irreducible disagreement here.But opCmp is a valid function! I can still call it, even though it's not putinto the typeinfo. Of course. But when it is well formed it is also a *special* function, because the compiler uses it automatically in some useful situations.But it's not a bug!If you define a Foo struct like this, it' quite probably a bug in your code: struct Foo { int x; const bool opEquals(ref const Foo f) { return true; } int opCmp(ref const Foo f) { return 0; } hash_t toHash() { return 10; } }There is nothing inherently special about opCmp.It's special because there is a contract between the compiler and the programmer, that such functions are used for example when you sort an array of structs or you put them in an associative array. If the compiler in some situations silently doesn't use such functions when the programmer surely intended them to be used that way by the compiler, then there's something wrong with the compiler.We need to add special notation conveying our intentions to the compilerThis is wrong. D language is designed to adopt the safe behaviour (= less bug prone, I am not talking about safeD) on default. So if feel the need of it, then you may add special notation to say the compiler to not use a function named opCmp for comparisons.It's like refusing to compile a function because it's not an english word.This is not the same thing. In Foo there are functions that have exactly the same name as the standard methods, and their signature too is very similar. I am not asking for the compiler to give a warning for a opcompare or even opcmp function. And by the way, what I am asking is an extension of what dmd is already doing, because in many other cases if you try to compile a struct that defined a wrong opCmp or toHash the compiler do shows an error. So I am asking for the error messages to cover this case too. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 PDT --- I think when you are defining opCmp, it's a bit like overriding a function signature, therefore, if D tries to prevent function hijacking, it makes perfect sense for it to do the same thing here. I have no problem with an enforced single interface for opCmp and opEquals and so on. I have no problem with having flexible interfaces where const or ref is optional and up to the user. I think either of these will work -- and I prefer the second, actually. Flexible is good, so long as it is practical to implement. But for the compiler to correctly generate code that calls the flexible API (pick any signature you like) for opCmp when "<" is used, but does not call it at all when ".sort" is used unless the signature matches a hidden template? Come on, it can't really be meant to work like that, right? It doesn't make sense for the language to have two different behaviors. It's an inconsistency that doesn't make any sense from a language or user point of view. I do think that for value types the .sort and T[] behaviors are best done in something like a template mechanism, semantically. Classes can use the Object.opCmp version but for struct and primitives it makes sense to fully and automatically specialize the code and build that into the language. (Or at least call whichever opCmp is present.) Kevin -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 20 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 06:16:08 PDT --- I will say the fact that sort does not use opCmp, but < does is bad. Worse than that, it uses a default opCmp which may appear to work but doesn't really. However the solution is not to disable other useful declarations of opCmp, but rather to fix sort. In fact, I think sort should be deprecated for std.algorithm.sort, which does use any opCmp defined. I'd like to see the whole "only save a function pointer in typeinfo if the signature matches exactly" just go away. I think we can solve these problems in better ways. It's a legacy thing that we can safely get rid of. Barring that, having a designation so the compiler can make an informed decision would be the second best option. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 21 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4290 See also bug 1309 -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 06 2011