www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - -cov LOC is inadequate for 1 liner branching;

reply Timothee Cour <thelastmammoth gmail.com> writes:
just filed https://issues.dlang.org/show_bug.cgi?id=18377:

`dmd -cov -run main.d` shows 100% coverage; this is misleading since a
branch is not taken:

```
void main(){
  int a;
  if(false) a+=10;
}
```

how about adding a `-covmode=[loc|branch]` that would allow either
reporting LOC coverage or branch coverage?

branch coverage would report number of branches taken at least once /
total number of branches.

It would not only address the above issue, but it is IMO a much better
metric for coverage, less sensitive to 'overcounting' of large blocks
in main code branches (size of code block in a branch is irrelevant as
far as testing is concerned); eg:

```
int fun(int x){
  if(x<0)
    return fun2();  // accounts for 1 LOC and 1 branch
  // long block of non-branching code here... // accounts for 10 LOC
and 1 branch
}
```

NOTE: branches would include anything that allows more than 1 code
path (eg: switch, if)
Feb 05 2018
next sibling parent timotheecour <timothee.cour2 gmail.com> writes:
On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:
wilzbach
 It probably makes sense to have a look at how other languages 
 are dealing with this. Seems to be a common problem
some links: * https://www.ncover.com/support/docs/extras/code-coverage/understan ing-branch-coverage => "Why do we need the Branch Coverage metric?" * http://coverage.readthedocs.io/en/coverage-4.4.2/branch.html => "coverage run --branch myprog.py"
Feb 11 2018
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/5/2018 11:32 AM, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:
 
 `dmd -cov -run main.d` shows 100% coverage; this is misleading since a
 branch is not taken:
 
 ```
 void main(){
    int a;
    if(false) a+=10;
 }
 ```
Consider how -cov works today: 2| x = 3; y = 4; 1| ++x; The first line has a count of 2, because there are two statements and each contributes one increment to the line. 1| x = 3; // reference count 2| if (true && false) c; 3| if (true && true) c; 1| if (false && b) c; The sequence points are each counted separately. So, by comparing with the 'reference count', you can see which sequence points are executed. Also, if one finds that unattractive, the code can be organized like: 1| if (true && 1| false) 0| c; and the separate counts will be revealed instead of aggregated. I agree that this is not ideal, however: 1. it works 2. it is simple and robust 3. the display to the user is simple 4. it's easy to aggregate multiple runs together with simple text processing code 5. one can 'fix' it with a stylistic change in the formatting of the source code Any further improvement would be a large increase in complexity of the implementation, and I don't know of reasonable way to present this to the user in a textual format. Is it worth it? I don't think so. Like builtin unittests, the big win with -cov is it is *trivial* to use, which encourages its adoption. It's a 99% solution, with 99% of the benefits, with 1% of the implementation effort. We should be expending effort elsewhere than putting an additional 99% effort to squeeze out that last 1% of benefit.
Feb 11 2018
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
I think you're missing my main point: it's explained here
https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
but the gist is that line based coverage is over-counting:
```
if(A)
  // 100 lines of code
else
  // 1 line of code
```
gives a line coverage of ~ 99% vs a branch coverage of ~50%
(assuming `else` branch never covered in unittests)

What matters as far as bugs are concerned is that 50% of cases are
covered. Increasing the size of the `if(A)` branch increases line
coverage (which is irrelevant) but not branch coverage.


On Sun, Feb 11, 2018 at 1:32 PM, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 2/5/2018 11:32 AM, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:

 `dmd -cov -run main.d` shows 100% coverage; this is misleading since a
 branch is not taken:

 ```
 void main(){
    int a;
    if(false) a+=10;
 }
 ```
Consider how -cov works today: 2| x = 3; y = 4; 1| ++x; The first line has a count of 2, because there are two statements and each contributes one increment to the line. 1| x = 3; // reference count 2| if (true && false) c; 3| if (true && true) c; 1| if (false && b) c; The sequence points are each counted separately. So, by comparing with the 'reference count', you can see which sequence points are executed. Also, if one finds that unattractive, the code can be organized like: 1| if (true && 1| false) 0| c; and the separate counts will be revealed instead of aggregated. I agree that this is not ideal, however: 1. it works 2. it is simple and robust 3. the display to the user is simple 4. it's easy to aggregate multiple runs together with simple text processing code 5. one can 'fix' it with a stylistic change in the formatting of the source code Any further improvement would be a large increase in complexity of the implementation, and I don't know of reasonable way to present this to the user in a textual format. Is it worth it? I don't think so. Like builtin unittests, the big win with -cov is it is *trivial* to use, which encourages its adoption. It's a 99% solution, with 99% of the benefits, with 1% of the implementation effort. We should be expending effort elsewhere than putting an additional 99% effort to squeeze out that last 1% of benefit.
Feb 11 2018
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2018 1:55 PM, Timothee Cour wrote:
 I think you're missing my main point: it's explained here
 https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
 but the gist is that line based coverage is over-counting:
 ```
 if(A)
    // 100 lines of code
 else
    // 1 line of code
 ```
 gives a line coverage of ~ 99% vs a branch coverage of ~50%
 (assuming `else` branch never covered in unittests)
 
 What matters as far as bugs are concerned is that 50% of cases are
 covered. Increasing the size of the `if(A)` branch increases line
 coverage (which is irrelevant) but not branch coverage.
I understand that point. The -cov coverage percentage is the line coverage, not the sequence point coverage. (Hence it will never be greater than 100%, and it will never underestimate the coverage. It would be more accurately termed an "upper bound" on the coverage.) And yes, that makes it fuzzy in proportion to how many lines contain multiple sequence points. Eliminating that fuzziness does require a vast increase in the complexity of the -cov implementation.
Feb 11 2018
parent reply Timothee Cour <thelastmammoth gmail.com> writes:
 -cov coverage percentage is the line coverage, not the sequence point coverage
[...]
 makes it fuzzy in proportion to how many lines contain multiple sequence points
Based on your comment I'm pretty sure you're still not getting my point, so apologies if I was unclear and let me try to explain better: it's not about `line coverage` vs `sequence point coverage`, as that difference is not very large (indeed, just 'fuzzy'). It's about `line coverage` vs `branch coverage` (see exact definition in linked article), that difference is very large in practice. here's my example, but more concretely explained: ``` void main(int a){ if(a>0){ statement1(); // line 3 statement2(); // line 4 ... statement100(); // line 102 } else{ statement101(); // line 104 } } unittest{ fun(1); } ``` * line coverage is around 99%. * sequence point coverage is also 99% (and would be close to that if some lines had multiple statements) * branch coverage is 50%. This is not an artificial example, this is the common case. What's more, code instrumentation to enable branch coverage is not more complex to implement compared to line coverage (I would even venture it's less complex and less costly). On Sun, Feb 11, 2018 at 2:32 PM, Walter Bright via Digitalmars-d <digitalmars-d puremagic.com> wrote:
 On 2/11/2018 1:55 PM, Timothee Cour wrote:
 I think you're missing my main point: it's explained here

 https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
 but the gist is that line based coverage is over-counting:
 ```
 if(A)
    // 100 lines of code
 else
    // 1 line of code
 ```
 gives a line coverage of ~ 99% vs a branch coverage of ~50%
 (assuming `else` branch never covered in unittests)

 What matters as far as bugs are concerned is that 50% of cases are
 covered. Increasing the size of the `if(A)` branch increases line
 coverage (which is irrelevant) but not branch coverage.
I understand that point. The -cov coverage percentage is the line coverage, not the sequence point coverage. (Hence it will never be greater than 100%, and it will never underestimate the coverage. It would be more accurately termed an "upper bound" on the coverage.) And yes, that makes it fuzzy in proportion to how many lines contain multiple sequence points. Eliminating that fuzziness does require a vast increase in the complexity of the -cov implementation.
Feb 11 2018
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2018 3:01 PM, Timothee Cour wrote:
 It's about `line coverage` vs `branch coverage` (see exact definition
 in linked article),
Ok, I see now, thanks for your patience and the clarification.
 that difference is very large in practice.
Only if one cares about the percentage metric. Pragmatically, I'm much more interested in exactly what was not covered rather than the percentage, and any good QA department should be focused on that rather than the percent metric. BTW, when I wrote the -cov implementation, I threw in the % metric just for fun <g>.
 What's more, code instrumentation to enable branch coverage is not
 more complex to implement compared to line coverage (I would even
 venture it's less complex and less costly).
I'm not convinced of that, as you've still got the: if (a) b; problem. Furthermore, in the first link they alluded to the thrown exception problem, with some gobbledy-gook answer on how they dealt with it (they punted). Sequence point coverage doesn't have that problem. Next, in the python linked example, they found it necessary to add a bunch of pragma's to give needed information to the branch counter. I have a hard time seeing anyone using that unless they're forced to by management. The idea with dmd's coverage is you type -cov and you get a coverage report. Done. The coverage report is a simple text file that is both readable by itself and trivially parse-able with a program for those who really want a GUI presentation of the results, or who want to integrate it with github (as has been done for us, yay!). (Back in the olden daze, a coverage analyzer was a separate product that came with a 200 page manual. Who bought it? Managers. Who used it? Nobody. You'd see the manual on programmers' shelves, still in its shrink wrap.) --- I'm actually glad to see people complain about -cov and want to improve it. It shows it is successful - people are using it! As Bjarne Stroustrup once said, there are two kinds of products - ones people complain about, and ones people don't use.
Feb 11 2018
prev sibling parent Johan Engelen <j j.nl> writes:
On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:
If I may add something semi-relevant to the current discussion, have a look here: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html It is non-trivial to add llvm-cov support to LDC, but half the work is already done in LDC [*]. The current instrumentation implementation in Clang and LDC for PGO/coverage only instruments branch points, and is therefore faster than DMD's `-cov` (also present in LDC) but less precise: it assumes an exception-less execution flow. It's however quite easy to add precision and deal with exception-flow, at the cost of execution speed of course. It'll be interesting to investigate the cost/benefit! Let me know if you're interested in working on it. A copy from Clang's documentation page, from which some cool ideas/capabilities can be glanced (needs monospaced font): ``` 1| 20|#define BAR(x) ((x) || (x)) ^20 ^2 2| 2|template <typename T> void foo(T x) { 3| 22| for (unsigned I = 0; I < 10; ++I) { BAR(I); } ^22 ^20 ^20^20 4| 2|} ------------------ | void foo<int>(int): | 2| 1|template <typename T> void foo(T x) { | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } | ^11 ^10 ^10^10 | 4| 1|} ------------------ | void foo<float>(int): | 2| 1|template <typename T> void foo(T x) { | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } | ^11 ^10 ^10^10 | 4| 1|} ------------------ ``` cheers, Johan [*] PGO instrumentation is the same, I can give some guidance to anybody who wants to work on it.
Feb 12 2018