digitalmars.D - -cov LOC is inadequate for 1 liner branching;

Timothee Cour (27/27) Feb 05 2018 just filed https://issues.dlang.org/show_bug.cgi?id=18377:

timotheecour (7/10) Feb 11 2018 some links:
Walter Bright (31/42) Feb 11 2018 Consider how -cov works today:

Timothee Cour (16/61) Feb 11 2018 I think you're missing my main point: it's explained here

Walter Bright (8/23) Feb 11 2018 I understand that point. The -cov coverage percentage is the line covera...

Timothee Cour (30/57) Feb 11 2018 Based on your comment I'm pretty sure you're still not getting my

Walter Bright (27/33) Feb 11 2018 Only if one cares about the percentage metric. Pragmatically, I'm much m...

Johan Engelen (41/42) Feb 12 2018 If I may add something semi-relevant to the current discussion,

Timothee Cour <thelastmammoth gmail.com> writes:

just filed https://issues.dlang.org/show_bug.cgi?id=18377:

`dmd -cov -run main.d` shows 100% coverage; this is misleading since a
branch is not taken:

```
void main(){
  int a;
  if(false) a+=10;
}
```

how about adding a `-covmode=[loc|branch]` that would allow either
reporting LOC coverage or branch coverage?

branch coverage would report number of branches taken at least once /
total number of branches.

It would not only address the above issue, but it is IMO a much better
metric for coverage, less sensitive to 'overcounting' of large blocks
in main code branches (size of code block in a branch is irrelevant as
far as testing is concerned); eg:

```
int fun(int x){
  if(x<0)
    return fun2();  // accounts for 1 LOC and 1 branch
  // long block of non-branching code here... // accounts for 10 LOC
and 1 branch
}
```

NOTE: branches would include anything that allows more than 1 code
path (eg: switch, if)

Feb 05 2018

timotheecour <timothee.cour2 gmail.com> writes:

On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:

 wilzbach
 It probably makes sense to have a look at how other languages 
 are dealing with this. Seems to be a common problem

some links:
* 
https://www.ncover.com/support/docs/extras/code-coverage/understan
ing-branch-coverage => "Why do we need the Branch Coverage metric?"
* http://coverage.readthedocs.io/en/coverage-4.4.2/branch.html => 
"coverage run --branch myprog.py"

Feb 11 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 2/5/2018 11:32 AM, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:
 
 `dmd -cov -run main.d` shows 100% coverage; this is misleading since a
 branch is not taken:
 
 ```
 void main(){
    int a;
    if(false) a+=10;
 }
 ```

Consider how -cov works today:

   2|  x = 3; y = 4;
   1|  ++x;

The first line has a count of 2, because there are two statements and each 
contributes one increment to the line.

   1|  x = 3; // reference count
   2|  if (true && false) c;
   3|  if (true && true) c;
   1|  if (false && b) c;

The sequence points are each counted separately. So, by comparing with the 
'reference count', you can see which sequence points are executed. Also, if one 
finds that unattractive, the code can be organized like:

   1|  if (true &&
   1|      false)
   0|     c;

and the separate counts will be revealed instead of aggregated.

I agree that this is not ideal, however:

1. it works
2. it is simple and robust
3. the display to the user is simple
4. it's easy to aggregate multiple runs together with simple text processing
code
5. one can 'fix' it with a stylistic change in the formatting of the source code

Any further improvement would be a large increase in complexity of the 
implementation, and I don't know of reasonable way to present this to the user 
in a textual format.

Is it worth it? I don't think so. Like builtin unittests, the big win with -cov 
is it is *trivial* to use, which encourages its adoption. It's a 99% solution, 
with 99% of the benefits, with 1% of the implementation effort. We should be 
expending effort elsewhere than putting an additional 99% effort to squeeze out 
that last 1% of benefit.

Feb 11 2018

Timothee Cour <thelastmammoth gmail.com> writes:

I think you're missing my main point: it's explained here
https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
but the gist is that line based coverage is over-counting:
```
if(A)
  // 100 lines of code
else
  // 1 line of code
```
gives a line coverage of ~ 99% vs a branch coverage of ~50%
(assuming `else` branch never covered in unittests)

What matters as far as bugs are concerned is that 50% of cases are
covered. Increasing the size of the `if(A)` branch increases line
coverage (which is irrelevant) but not branch coverage.


On Sun, Feb 11, 2018 at 1:32 PM, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 2/5/2018 11:32 AM, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:

 `dmd -cov -run main.d` shows 100% coverage; this is misleading since a
 branch is not taken:

 ```
 void main(){
    int a;
    if(false) a+=10;
 }
 ```


 Consider how -cov works today:

   2|  x = 3; y = 4;
   1|  ++x;

 The first line has a count of 2, because there are two statements and each
 contributes one increment to the line.

   1|  x = 3; // reference count
   2|  if (true && false) c;
   3|  if (true && true) c;
   1|  if (false && b) c;

 The sequence points are each counted separately. So, by comparing with the
 'reference count', you can see which sequence points are executed. Also, if
 one finds that unattractive, the code can be organized like:

   1|  if (true &&
   1|      false)
   0|     c;

 and the separate counts will be revealed instead of aggregated.

 I agree that this is not ideal, however:

 1. it works
 2. it is simple and robust
 3. the display to the user is simple
 4. it's easy to aggregate multiple runs together with simple text processing
 code
 5. one can 'fix' it with a stylistic change in the formatting of the source
 code

 Any further improvement would be a large increase in complexity of the
 implementation, and I don't know of reasonable way to present this to the
 user in a textual format.

 Is it worth it? I don't think so. Like builtin unittests, the big win with
 -cov is it is *trivial* to use, which encourages its adoption. It's a 99%
 solution, with 99% of the benefits, with 1% of the implementation effort. We
 should be expending effort elsewhere than putting an additional 99% effort
 to squeeze out that last 1% of benefit.

Feb 11 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 2/11/2018 1:55 PM, Timothee Cour wrote:
 I think you're missing my main point: it's explained here
 https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
 but the gist is that line based coverage is over-counting:
 ```
 if(A)
    // 100 lines of code
 else
    // 1 line of code
 ```
 gives a line coverage of ~ 99% vs a branch coverage of ~50%
 (assuming `else` branch never covered in unittests)
 
 What matters as far as bugs are concerned is that 50% of cases are
 covered. Increasing the size of the `if(A)` branch increases line
 coverage (which is irrelevant) but not branch coverage.

I understand that point. The -cov coverage percentage is the line coverage, not 
the sequence point coverage. (Hence it will never be greater than 100%, and it 
will never underestimate the coverage. It would be more accurately termed an 
"upper bound" on the coverage.)

And yes, that makes it fuzzy in proportion to how many lines contain multiple 
sequence points. Eliminating that fuzziness does require a vast increase in the 
complexity of the -cov implementation.

Feb 11 2018

Timothee Cour <thelastmammoth gmail.com> writes:

 -cov coverage percentage is the line coverage, not the sequence point coverage
[...]
 makes it fuzzy in proportion to how many lines contain multiple sequence points

Based on your comment I'm pretty sure you're still not getting my
point, so apologies if I was unclear and let me try to explain better:

it's not about `line coverage` vs `sequence point coverage`, as that
difference is not very large (indeed, just 'fuzzy').

It's about `line coverage` vs `branch coverage` (see exact definition
in linked article), that difference is very large in practice.

here's my example, but more concretely explained:
```
void main(int a){
if(a>0){
  statement1(); // line 3
  statement2(); // line 4
...
  statement100(); // line 102
} else{
  statement101(); // line 104
}
}
unittest{ fun(1); }
```
* line coverage is around 99%.
* sequence point coverage is also 99% (and would be close to that if
some lines had multiple statements)
* branch coverage is 50%.

This is not an artificial example, this is the common case.

What's more, code instrumentation to enable branch coverage is not
more complex to implement compared to line coverage (I would even
venture it's less complex and less costly).



On Sun, Feb 11, 2018 at 2:32 PM, Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 2/11/2018 1:55 PM, Timothee Cour wrote:
 I think you're missing my main point: it's explained here

 https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
 but the gist is that line based coverage is over-counting:
 ```
 if(A)
    // 100 lines of code
 else
    // 1 line of code
 ```
 gives a line coverage of ~ 99% vs a branch coverage of ~50%
 (assuming `else` branch never covered in unittests)

 What matters as far as bugs are concerned is that 50% of cases are
 covered. Increasing the size of the `if(A)` branch increases line
 coverage (which is irrelevant) but not branch coverage.


 I understand that point. The -cov coverage percentage is the line coverage,
 not the sequence point coverage. (Hence it will never be greater than 100%,
 and it will never underestimate the coverage. It would be more accurately
 termed an "upper bound" on the coverage.)

 And yes, that makes it fuzzy in proportion to how many lines contain
 multiple sequence points. Eliminating that fuzziness does require a vast
 increase in the complexity of the -cov implementation.

Feb 11 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 2/11/2018 3:01 PM, Timothee Cour wrote:
 It's about `line coverage` vs `branch coverage` (see exact definition
 in linked article),

Ok, I see now, thanks for your patience and the clarification.


 that difference is very large in practice.

Only if one cares about the percentage metric. Pragmatically, I'm much more 
interested in exactly what was not covered rather than the percentage, and any 
good QA department should be focused on that rather than the percent metric.

BTW, when I wrote the -cov implementation, I threw in the % metric just for fun
<g>.


 What's more, code instrumentation to enable branch coverage is not
 more complex to implement compared to line coverage (I would even
 venture it's less complex and less costly).

I'm not convinced of that, as you've still got the:

    if (a) b;

problem. Furthermore, in the first link they alluded to the thrown exception 
problem, with some gobbledy-gook answer on how they dealt with it (they
punted). 
Sequence point coverage doesn't have that problem.

Next, in the python linked example, they found it necessary to add a bunch of 
pragma's to give needed information to the branch counter. I have a hard time 
seeing anyone using that unless they're forced to by management. The idea with 
dmd's coverage is you type -cov and you get a coverage report. Done.

The coverage report is a simple text file that is both readable by itself and 
trivially parse-able with a program for those who really want a GUI
presentation 
of the results, or who want to integrate it with github (as has been done for 
us, yay!).

(Back in the olden daze, a coverage analyzer was a separate product that came 
with a 200 page manual. Who bought it? Managers. Who used it? Nobody. You'd see 
the manual on programmers' shelves, still in its shrink wrap.)

---

I'm actually glad to see people complain about -cov and want to improve it. It 
shows it is successful - people are using it! As Bjarne Stroustrup once said, 
there are two kinds of products - ones people complain about, and ones people 
don't use.

Feb 11 2018

Johan Engelen <j j.nl> writes:

On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:
 just filed https://issues.dlang.org/show_bug.cgi?id=18377:

If I may add something semi-relevant to the current discussion, 
have a look here:
https://clang.llvm.org/docs/SourceBasedCodeCoverage.html

It is non-trivial to add llvm-cov support to LDC, but half the 
work is already done in LDC [*].
The current instrumentation implementation in Clang and LDC for 
PGO/coverage only instruments branch points, and is therefore 
faster than DMD's `-cov` (also present in LDC) but less precise: 
it assumes an exception-less execution flow. It's however quite 
easy to add precision and deal with exception-flow, at the cost 
of execution speed of course. It'll be interesting to investigate 
the cost/benefit! Let me know if you're interested in working on 
it.

A copy from Clang's documentation page, from which some cool 
ideas/capabilities can be glanced (needs monospaced font):
```
     1|   20|#define BAR(x) ((x) || (x))
                            ^20     ^2
     2|    2|template <typename T> void foo(T x) {
     3|   22|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
                                    ^22     ^20  ^20^20
     4|    2|}
------------------
| void foo<int>(int):
|      2|    1|template <typename T> void foo(T x) {
|      3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
|                                     ^11     ^10  ^10^10
|      4|    1|}
------------------
| void foo<float>(int):
|      2|    1|template <typename T> void foo(T x) {
|      3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
|                                     ^11     ^10  ^10^10
|      4|    1|}
------------------
```

cheers,
   Johan

[*] PGO instrumentation is the same, I can give some guidance to 
anybody who wants to work on it.

Feb 12 2018

D Programming

C/C++ Programming

Other

digitalmars.D - -cov LOC is inadequate for 1 liner branching;