digitalmars.D - Clang error recovery

reply bearophile <bearophileHUGS lycos.com> writes:
This is the latest post on the LLVM blog, "Amazing feats of Clang Error
Recovery", by Chris Lattner:
http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html

I've compared dmd to a few of those examples of Clang usage. I don't comment on each
one of them because some are specific to C++, and because I
don't understand some of the others. If you can write better translations to D
or if you understand more of them, you can add and show me more comparisons :-)

=============================

Clang:


int foo(int x, pid_t y) {
  return x+y;
}


t.c:1:16: error: unknown type name 'pid_t'
int foo(int x, pid_t y) {
               ^

-----------------

dmd 2.042:

int foo(int x, pid_t y) {
  return x+y;
}
void main() {}


temp.d(1): Error: identifier 'pid_t' is not defined
temp.d(1): Error: pid_t is used as a type
temp.d(1): Error: cannot have parameter of type void

-----------------

Here Clang gives a single error message (and it gives the error column).
The errors given by dmd convey the same information.
The dmd error messages are better than the GCC 4.2 ones, but I think a
single good error message is better than three.

=============================

Clang:


#include <inttypes.h>
int64 x;


t.c:2:1: error: unknown type name 'int64'; did you mean 'int64_t'?
int64 x;
^~~~~
int64_t

-----------------

dmd:

I am not sure if this is the same situation:

alias uint uint64_t;
int foo(uint64 x) {
  return x * 2;
}
void main() {}


dmd prints:

temp.d(2): Error: identifier 'uint64' is not defined
temp.d(2): Error: uint64 is used as a type
temp.d(2): Error: cannot have parameter of type void


That page says:

>Code that later used 'x', for example, knows that it is declared as an int64_t,
so it doesn't lead to other weird follow on errors that don't make any sense.<

But I don't understand what it means. Maybe this is why this blog post is titled
"Clang Error Recovery" instead of "Clang Error Messages".

Later in the same post it says something similar:

>In addition to getting the error message right (and suggesting a fixit
replacement to "::"), Clang "knows what you mean" so it handles the subsequent
uses of a2 correctly.<

=============================

Clang:


namespace foo {
  struct x { int y; };
}
namespace bar {
  typedef int y;
}
void test() {
  foo::x a;
  bar::y b;
  a + b;
}


t.cc:10:5: error: invalid operands to binary expression ('foo::x' and 'bar::y' (aka 'int'))
  a + b;
  ~ ^ ~

-----------------

dmd:

D doesn't have the namespaces, so I have adapted that code like this:

struct foo {
  static struct x { int y; };
}
struct bar {
  typedef int y;
}
void test() {
  foo.x a;
  bar.y b;
  a + b;
}
void main() {}


dmd prints:

temp.d(10): Error: incompatible types for ((a) + (b)): 'x' and 'y'
temp.d(10): Error: + has no effect in expression ((__error) + (__error))

Bye,
bearophile
Apr 06 2010
next sibling parent reply Robert Clipsham <robert octarineparrot.com> writes:
On 06/04/10 13:05, bearophile wrote:
 Here Clangs gives a single error message (and it gives the error column).

The error column won't happen in DMD. Walter has mentioned many times before that no one ever commented on the feature, and no one seemed to care when it disappeared... Maybe this will change as clang increases in popularity? That said, with properly formatted code it's generally not hard to find where your error is, as there isn't much on a line.

I also like the idea of giving a single error message. Most of the time when I have a nice list of errors I skim through and look for identifier names where I forgot to import a module, or pick out some line numbers that seem to have the first error on and jump to them in my editor... Both of these would be easier to do with only one error message.
 Here the errors given by dmd give the same information.
 Here dmd gives error messages that are better than GCC 4.2 ones, but I think a
single good error message is better than three.

I agree here, those extra errors can be a pain sometimes when you're trying to find the actual cause.
 temp.d(2): Error: identifier 'uint64' is not defined
 temp.d(2): Error: uint64 is used as a type
 temp.d(2): Error: cannot have parameter of type void


 That page says:

 Code that later used 'x', for example, knows that it is declared as an
int64_t, so it doesn't lead to other weird follow on errors that don't make any
sense.<


I don't think the spell checker has been implemented for types, just identifiers... If this is the case then it shouldn't take much to add in the spell checking. As for later error messages, that will take more effort, although I would imagine it would just require changing the type to the spell checked type internally to avoid the later errors.
 But I don't understand what it means. Maybe this is why this blog post is
titled "Clang Error Recovery" instead of "Clang Error Messages".

It means that clang knows the type given is incorrect, so when it continues analyzing the code it will pretend you gave (what it thinks is) the right type, rather than giving more errors because you gave the incorrect type.
 Later in the same post it says something similar:

 In addition to getting the error message right (and suggesting a fixit
replacement to "::"), Clang "knows what you mean" so it handles the subsequent
uses of a2 correctly.<


This is the same thing as above.
 temp.d(10): Error: incompatible types for ((a) + (b)): 'x' and 'y'
 temp.d(10): Error: + has no effect in expression ((__error) + (__error))

Other than the column, this gives roughly the same information. I guess the only way to improve here would be to remove the second error, but it's not really much of an issue.
 Bye,
 bearophile

Apr 06 2010
parent reply bearophile <bearophileHUGS lycos.com> writes:
Thank you for your comments, Robert Clipsham.

>The error column won't happen in DMD,<

The C# dotnet compiler too shows the error column. But I agree with you that it's not so important.

A few possible improvements of dmd error messages, from that article, from your answers and from my experience:
- A compiler switch to stop the compilation after the first or first few error messages;
- Use the true Levenshtein distance to find the typing errors;
- Spell checker for types too;
- Maybe, as you suggest, changing the type to the spell checked type internally to avoid some of the later errors;
- Printing fewer error messages, increasing their semantic density. This is not easy to do;
- In Bugzilla I have added some bug reports that list specific situations where the error message can be improved (in theory even I can fix some of them. In the meantime one of them has been fixed by Don and Walter).

Bye,
bearophile
Apr 06 2010
parent reply Robert Clipsham <robert octarineparrot.com> writes:
On 06/04/10 19:19, bearophile wrote:
 Few possible improvements of Dmd error messages, from that article, from your
answers and from my experience:
 - A compiler switch to stop the compilation after the first or few first error
messages;

I really don't like this option, if there needs to be an option for it the compiler is doing something wrong. On posix based systems you can use:

dmd myFileWithErrors.d |& head

to replicate this if you want it. I don't know about Windows.
 - Use the true Levenshtein distance to find the typing errors;

The suggestions I've received for misspelled types have been pretty good, I'm not sure what advantage this would give. I'd agree that the proper way to do it should be used though, particularly if it gives better suggestions.
 - Spell checker for types too;
 - Maybe, as you suggest, changing the type to the spell checked type
internally to avoid some of the later errors;

Agreed, neither of these should be too hard to add should someone feel inclined to do so.
 - Printing less error messages, increasing their semantic density. This is not
easy to do;

This would also be nice, but I think the effort required to add it is too much for us to worry about at this stage, there are far more important things to work on.
Apr 06 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Robert Clipsham:

 I really don't like this option, if there needs to be an option for it 
 the compiler is doing something wrong. On posix based systems you can use:
 dmd myFileWithErrors.d |& head

GCC has that option, it's named -Wfatal-errors
 To replicate this if you want it, I don't know about windows.

I use GNU head on Windows too:
http://sourceforge.net/projects/unxutils/

A small problem with head is that the compilation goes on, and it can take some time to stop, while -Wfatal-errors stops the compiler quickly.

Bye,
bearophile
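The difference between piping through head and stopping inside the compiler can be sketched in C++. This is only an illustration under assumptions, not how GCC or dmd implement it: ErrorSink, TooManyErrors and analyze are hypothetical names. The diagnostic sink counts errors and unwinds the whole analysis as soon as a limit is reached, so no further work is done after the last message that will be shown.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Thrown to unwind out of the (hypothetical) compilation pipeline.
struct TooManyErrors : std::runtime_error {
    TooManyErrors() : std::runtime_error("error limit reached") {}
};

// Hypothetical diagnostic sink that aborts once `limit` errors were reported.
class ErrorSink {
public:
    explicit ErrorSink(std::size_t limit) : limit_(limit) {}

    void error(const std::string& msg) {
        messages_.push_back(msg);
        if (messages_.size() >= limit_) throw TooManyErrors();
    }

    const std::vector<std::string>& messages() const { return messages_; }

private:
    std::size_t limit_;
    std::vector<std::string> messages_;
};

// Stand-in for a compiler pass that would otherwise report every problem.
void analyze(const std::vector<std::string>& problems, ErrorSink& sink) {
    for (const auto& p : problems) sink.error(p);
}
```

A driver would wrap the call to analyze in a try block, catch TooManyErrors, print the collected messages and exit. With a limit of 1 this behaves like a "stop at the first error" switch.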
Apr 06 2010
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
bearophile wrote:
 This is the latest post on the LLVM blog, "Amazing feats of Clang Error
Recovery", by Chris Lattner:
 http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html
 
 I've compared dmd to few of those examples of Clang usage. I don't comment
each one of those things because some of them are specific of C++, and because
I don't understand some other of them. If you can write better translations to
D or if you understand more of them, you can add and show me more comparisons
:-)
 
 =============================
 
 Clang:
 
 
 int foo(int x, pid_t y) {
   return x+y;
 }
 
 
 t.c:1:16: error: unknown type name 'pid_t'
 int foo(int x, pid_t y) {
                ^
 
 -----------------
 
 dmd 2.042:
 
 int foo(int x, pid_t y) {
   return x+y;
 }
 void main() {}
 
 
 temp.d(1): Error: identifier 'pid_t' is not defined
 temp.d(1): Error: pid_t is used as a type
 temp.d(1): Error: cannot have parameter of type void
 
 -----------------
 
 Here Clangs gives a single error message (and it gives the error column).
 Here the errors given by dmd give the same information.
 Here dmd gives error messages that are better than GCC 4.2 ones, but I think a
single good error message is better than three.

dmd has an ErrorType when an expression or something gives an error. The problem is it is a kind of an alias of an int type, so it continues to give errors. If such ErrorType would not trigger errors anymore, that would solve the problem. (in some cases I think a void type is returned instead of an error type)
Apr 06 2010
parent reply Don <nospam nospam.com> writes:
Ary Borenszweig wrote:
dmd has an ErrorType when an expression or something gives an error. The problem is it is a kind of an alias of an int type, so it continues to give errors. If such ErrorType would not trigger errors anymore, that would solve the problem. (in some cases I think a void type is returned instead of an error type)

There's also an ErrorExpression which is used in many places, and it generally works properly in suppressing errors. (__error shows up in error messages when it hasn't been treated properly).
Apr 06 2010
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don wrote:
 There's also an ErrorExpression which is used in many places, and it 
 generally works properly in suppressing errors. (__error shows up in 
 error messages when it hasn't been treated properly).

Attempting to correct the error and move forward with the compilation sounds good, but generally is a hopeless failure. A far better approach, one that is half-implemented in dmd, is to replace failed types, expressions, etc., with special error productions, and then suppress further messages that have as operands one of those error productions. It's analogous to using NaNs in floating point.
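
The NaN analogy can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not dmd's actual code: Type, Diagnostics, resolveType and checkAdd are hypothetical names, and the type system is reduced to three cases. An unknown type is reported once and becomes a poisoned Error type; any later check that sees a poisoned operand stays silent and passes the poison on.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature type system; Error plays the role of a NaN.
enum class Type { Int, Struct, Error };

// Collects the messages that would be printed to the user.
struct Diagnostics {
    std::vector<std::string> messages;
    void error(const std::string& msg) { messages.push_back(msg); }
};

// Resolve a type name. An unknown name yields one message and a poisoned type.
Type resolveType(const std::string& name, Diagnostics& diag) {
    if (name == "int") return Type::Int;
    if (name == "x") return Type::Struct;
    diag.error("unknown type name '" + name + "'");
    return Type::Error;
}

// Type-check `lhs + rhs`. If either operand is already poisoned, report
// nothing and propagate the poison, just as NaN + 1.0 is still NaN.
Type checkAdd(Type lhs, Type rhs, Diagnostics& diag) {
    if (lhs == Type::Error || rhs == Type::Error) return Type::Error;
    if (lhs != Type::Int || rhs != Type::Int) {
        diag.error("incompatible types for +");
        return Type::Error;
    }
    return Type::Int;
}
```

Under this scheme a function taking an undefined pid_t parameter and returning x+y would produce one message for pid_t and none for the addition, because the addition only ever sees an operand that has already been reported.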
Apr 06 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 04/06/2010 04:52 PM, Walter Bright wrote:
 Don wrote:
 There's also an ErrorExpression which is used in many places, and it
 generally works properly in suppressing errors. (__error shows up in
 error messages when it hasn't been treated properly).

Attempting to correct the error and move forward with the compilation sounds good, but generally is a hopeless failure. A far better approach, one that is half-implemented in dmd, is to replace failed types, expressions, etc., with special error productions, and then suppress further messages that have as operands one of those error productions. It's analogous to using NaNs in floating point.

NaP should be its name I guess :o).

Andrei
Apr 06 2010
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
bearophile wrote:
 Clang:
 
 
 #include <inttypes.h>
 int64 x;
 
 
 t.c:2:1: error: unknown type name 'int64'; did you mean 'int64_t'?
 int64 x;
 ^~~~~
 int64_t
 
 -----------------
 
 dmd:
 
 I am not sure if this is the same situation:
 
 alias uint uint64_t;
 int foo(uint64 x) {
   return x * 2;
 }
 void main() {}
 
 
 dmd prints:
 
 temp.d(2): Error: identifier 'uint64' is not defined

dmd's spell checker only looks a distance of one, and uint64 is a distance of two from uint64_t. This is trivially changed, but I didn't do the longer distances because of the annoyances of false positives - variable name spelling doesn't work like English language spelling. The issue is not, as has been suggested, that dmd doesn't do spelling checks on types. There's really nothing "amazing" about a spell checker, it's just a better idea than not doing it.
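
To see why a distance-one check finds nothing for uint64, here is a minimal sketch of such a check in C++. It is not the code from dmd: withinOneEdit and suggest are hypothetical names, and the sketch assumes that one edit means one insertion, deletion, substitution, or swap of two adjacent characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if `a` can be turned into `b` with at most one edit: one insertion,
// deletion, substitution, or swap of two adjacent characters.
bool withinOneEdit(const std::string& a, const std::string& b) {
    if (a.size() > b.size()) return withinOneEdit(b, a);  // make `a` the shorter one
    const std::size_t n = a.size(), m = b.size();
    if (m - n > 1) return false;  // would need at least two insertions

    std::size_t i = 0;
    while (i < n && a[i] == b[i]) ++i;  // skip the common prefix
    if (i == n) return true;            // equal, or `b` has one extra trailing char

    const std::size_t npos = std::string::npos;
    if (n == m) {
        // Substitution at position i: the remainders must match.
        if (a.compare(i + 1, npos, b, i + 1, npos) == 0) return true;
        // Swap of positions i and i+1: the rest after the pair must match.
        return i + 1 < n && a[i] == b[i + 1] && a[i + 1] == b[i] &&
               a.compare(i + 2, npos, b, i + 2, npos) == 0;
    }
    // m == n + 1: `b` has one extra character at position i.
    return a.compare(i, npos, b, i + 1, npos) == 0;
}

// Return every known name within one edit of `name`.
std::vector<std::string> suggest(const std::string& name,
                                 const std::vector<std::string>& known) {
    std::vector<std::string> hits;
    for (const auto& k : known)
        if (withinOneEdit(name, k)) hits.push_back(k);
    return hits;
}
```

With these definitions, suggest("uint64", {"uint64_t"}) comes back empty because two insertions are needed, while a swapped pair such as "lenght" is still matched to "length".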
Apr 06 2010
parent reply Brad Roberts <braddr slice-2.puremagic.com> writes:
On Tue, 6 Apr 2010, Walter Bright wrote:

 bearophile wrote:
 Clang:
 
 
 #include <inttypes.h>
 int64 x;
 
 
 t.c:2:1: error: unknown type name 'int64'; did you mean 'int64_t'?
 int64 x;
 ^~~~~
 int64_t
 
 -----------------
 
 dmd:
 
 I am not sure if this is the same situation:
 
 alias uint uint64_t;
 int foo(uint64 x) {
   return x * 2;
 }
 void main() {}
 
 
 dmd prints:
 
 temp.d(2): Error: identifier 'uint64' is not defined

dmd's spell checker only looks a distance of one, and uint64 is a distance of two from uint64_t. This is trivially changed, but I didn't do the longer distances because of the annoyances of false positives - variable name spelling doesn't work like English language spelling. The issue is not, as has been suggested, that dmd doesn't do spelling checks on types. There's really nothing "amazing" about a spell checker, it's just a better idea than not doing it.

Consider trying increasing distances (with some relatively low max). If you hit a single suggestable correction, substitute it. i.e., for uint64, nothing at 0 or 1, one at 2 (uint64_t) so use it (but still error, obviously).

This could be particularly useful for simple 2 letter transpositions, if those are found by the checker.. a common thing for a lot of people for length, for instance.

Later,
Brad
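
The increasing-distance search could look roughly like the C++ below. It is a sketch under assumptions rather than a description of any existing checker: editDistance and uniqueSuggestion are hypothetical names, the distance counts a swap of adjacent characters as one edit (the optimal string alignment variant), and an empty string stands for "no suggestion". The search stops at the first distance that has any candidate and only suggests when that candidate is the only one.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Edit distance where insertion, deletion, substitution and a swap of two
// adjacent characters each cost one.
std::size_t editDistance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<std::size_t>> d(n + 1, std::vector<std::size_t>(m + 1));
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,           // deletion
                                d[i][j - 1] + 1,           // insertion
                                d[i - 1][j - 1] + cost});  // substitution or match
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                d[i][j] = std::min(d[i][j], d[i - 2][j - 2] + 1);  // adjacent swap
        }
    }
    return d[n][m];
}

// Try distance 1, then 2, and so on up to maxDistance. Return the candidate
// only if it is the single match at the closest distance that has any match;
// otherwise return an empty string.
std::string uniqueSuggestion(const std::string& name,
                             const std::vector<std::string>& known,
                             std::size_t maxDistance = 2) {
    for (std::size_t dist = 1; dist <= maxDistance; ++dist) {
        std::vector<std::string> hits;
        for (const auto& k : known)
            if (editDistance(name, k) == dist) hits.push_back(k);
        if (hits.size() == 1) return hits.front();
        if (!hits.empty()) return "";  // ambiguous at the closest distance
    }
    return "";
}
```

Refusing to suggest when the closest distance is ambiguous is one way to limit the false positives that longer distances bring; a real checker might rank the candidates instead.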
Apr 06 2010
parent Walter Bright <newshound1 digitalmars.com> writes:
Brad Roberts wrote:
 Consider trying increasing distances (with some relatively low max).  If 
 you hit a single suggestable correction, substitute it.  ie, for uint64, 
 nothing at 0 or 1, one at 2 (uint64_t) so use it (but still error, 
 obviously).

Of course if you do more distances, pick the shortest match!
 
 This could be particularly useful for simple 2 letter transpositions, if 
 those are found by the checker.. a common thing for a lot of people for 
 length, for instance.

Transpositions count as 1. See the dmd source file speller.c.
Apr 06 2010