digitalmars.D - Volunteer for research project?


Brad Roberts <braddr puremagic.com> writes:

Would any of you be interested in helping out (read that as "doing") a
research / data mining project for us?  I'd love to take all of the
regressions this year (or for the last year, or whatever period of time
can be reasonably accomplished) and track them back to which commit
introduced each of them (already done for some of them).  From there,
I'd like to see what sort of correlations can be found.  Is there a
particular area of code that's responsible for them?  Is there a
particular feature (spread across a lot of files, maybe) that's
responsible?  Etc.

Maybe it's all over the map.  Maybe it will highlight one or a few areas
to take a harder look at.

Anyone interested?

Thanks,
Brad

Feb 20 2013

"Maxim Fomin" <maxim maxim-fomin.ru> writes:

On Thursday, 21 February 2013 at 07:03:08 UTC, Brad Roberts wrote:
 Would any of you be interested in helping out (read that as 
 "doing") a research / data mining project for us?  I'd love
 to take all of the regressions this year (or for the last year, 
 or whatever period of time can be reasonably
 accomplished) and track them back to which commit introduced 
 each of them (already done for some of them).  From there,
 I'd like to see what sort of correlations can be found.  Is 
 there a particular area of code that's responsible for them.
  Is there a particular feature (spread across a lot of files, 
 maybe) that's responsible.  Etc.

 Maybe it's all over the map.  Maybe it will highlight one or a 
 few areas to take a harder look at.

 Anyone interested?

 Thanks,
 Brad

It sounds interesting, but what are you expecting to find? And 
how sure are you that you can find something? I would expect 
that code which fixes some feature often breaks the same feature 
in another respect, which is quite obvious. Sometimes one piece 
of code relies implicitly on the behavior of another, so when you 
change the latter, the former stops working correctly. You give 
the example of a feature spread across several files - how does 
knowing this help in reducing regressions?

Feb 21 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Fri, Feb 22, 2013 at 06:51:53AM +0100, Maxim Fomin wrote:
 On Thursday, 21 February 2013 at 07:03:08 UTC, Brad Roberts wrote:
Would any of you be interested in helping out (read that as "doing")
a research / data mining project for us?  I'd love to take all of the
regressions this year (or for the last year, or whatever period of
time can be reasonably accomplished) and track them back to which
commit introduced each of them (already done for some of them).  From
there, I'd like to see what sort of correlations can be found.  Is
there a particular area of code that's responsible for them.  Is
there a particular feature (spread across a lot of files, maybe)
that's responsible.  Etc.

Maybe it's all over the map.  Maybe it will highlight one or a few
areas to take a harder look at.

Anyone interested?

Thanks,
Brad

 
 It sounds interesting, but what are you expecting to found? And how
 much are you sure you can found something? I would expect that often
 code which fixes some feature breaks the same feature in another
 aspect of functioning which is quite obvious. Sometimes one code
 relies implicitly on functioning of other code, so when you change the
 the latter, the former stops working correctly. You provide example
 with spreading across several files - how does knowing this helps in
 reducing regressions?

I would think he's referring to issues that are filed in the bugtracker.
Obviously, we have no way of knowing if a code change broke something if
nobody found any bug afterwards!

So I'm thinking it's probably a matter of going through the regression
bugs in the bugtracker, making test cases to reproduce them, and then
using git bisect to figure out which commit introduced each problem.
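
The search that git bisect performs can be sketched as a plain binary
search. This is only an illustration under simplifying assumptions: the
history is linear, the commits are ordered oldest to newest, the newest
one fails the test case, and once the test case fails it keeps failing.
The function name and the commit labels used with it are hypothetical;
real git bisect also copes with branching history.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and every commit after the first bad one is bad as well.
    """
    if not commits:
        raise ValueError("need at least one commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the regression is at mid or earlier
        else:
            lo = mid + 1    # the regression is strictly after mid
    return commits[lo]
```

Here is_bad would build the compiler at the given commit and run the
test case from the bug report, so each regression costs roughly
log2(number of commits) builds.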


T

-- 
Public parking: euphemism for paid parking. -- Flora

Feb 21 2013

"Maxim Fomin" <maxim maxim-fomin.ru> writes:

On Friday, 22 February 2013 at 06:02:20 UTC, H. S. Teoh wrote:
 I would think he's referring to issues that are filed in the 
 bugtracker.
 Obviously, we have no way of knowing if a code change broke 
 something if
 nobody found any bug afterwards!

Yes, it is obvious that he refers to bugzilla issues.

 So I'm thinking it's probably a matter of going through the 
 regression
 bugs in the bugtracker, and making test cases to reproduce 
 them, and
 then use git bisect to figure out which commit introduced the 
 problem.


 T

This is also obvious. The question is what to do with such 
information next, how to analyze it and interpret the results.

For example, take http://d.puremagic.com/issues/show_bug.cgi?id=9406 
(the commit which introduced the regression is identified there). 
What can you infer from fixed regressions 
(http://d.puremagic.com/issues/buglist.cgi?query_format=advanced&bug_severity=regression&bug_status=RESOLVED&resolution=FIXED) 
that would be useful in fighting the non-closed ones?

P.S. There is something wrong either with the forum or with the 
way you reply. In my mailbox the discussion is a single thread, 
but in the forum it is split into two. Posting a message in one 
thread as a reply to a post in another is strange. Do you use 
email or the forum to reply?

Feb 21 2013

Brad Roberts <braddr puremagic.com> writes:

On 2/21/2013 10:00 PM, H. S. Teoh wrote:
 So I'm thinking it's probably a matter of going through the regression
 bugs in the bugtracker, and making test cases to reproduce them, and
 then use git bisect to figure out which commit introduced the problem.

Pretty much that. (Nearly) every bug comes with a test case already. The
part that will be work is taking that test case and finding the exact
commit that broke it. By definition, a regression once worked and
something changed that broke it. My hope is that one or more people can
spend some time going through each regression report in bugzilla and
tracking down the exact commit for each.

What will be uncovered by the effort? Who knows. It's better not to try
to anticipate or predict, since that can bias the analysis. The entire
point of the exercise is to find out. If there are one or more obvious
or detectable clusters, that gives us some interesting data. It might
well point out a part of the code that's particularly sensitive to
change, or is very poorly covered by the test suite, or is flawed in
some other way. Regardless, if there are clusters, it's worth some study
and pondering to consider what can be done to make them NOT hotbeds of
regressions.
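
One simple way to look for such clusters, sketched here as an
assumption rather than a prescribed method: once each regression has
been traced to its introducing commit, count how often each file shows
up among the files those commits touched. The function name, the bug
numbers and the file names used with it are hypothetical placeholders.

```python
from collections import Counter
from typing import Dict, List, Tuple


def hot_spots(files_by_bug: Dict[int, List[str]], top: int = 3) -> List[Tuple[str, int]]:
    """Count, per file, how many regressions were introduced by a commit touching it.

    files_by_bug maps a bug number to the files changed by the commit
    that introduced that regression.
    """
    counts: Counter = Counter()
    for files in files_by_bug.values():
        # Count a file once per regression, even if it is listed twice.
        counts.update(sorted(set(files)))
    return counts.most_common(top)
```

Grouping by directory or by language feature instead of by file would
only need a different key, which is where the judgment about what
counts as an "area of code" comes in.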

It's a research project. It might turn out to yield nothing useful.
That's certainly a risk. I suspect it won't turn out to be fruitless.

To seed the effort, here are all the regression bugs that have changed
since the beginning of the year:

http://d.puremagic.com/issues/buglist.cgi?chfieldto=Now&query_format=advanced&chfieldfrom=2013-01-01&bug_severity=regression&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED
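
For anyone who wants a different time window, the query above can be
rebuilt from its parts. A small sketch, assuming only the parameter
names and values visible in that URL; the function and constant names
are made up for illustration.

```python
from urllib.parse import urlencode

BUGLIST = "http://d.puremagic.com/issues/buglist.cgi"

# Every status listed in the query above, so open and closed bugs both match.
STATUSES = ["UNCONFIRMED", "NEW", "ASSIGNED", "REOPENED",
            "RESOLVED", "VERIFIED", "CLOSED"]


def regression_query(changed_from: str, changed_to: str = "Now") -> str:
    """Build a buglist URL for regressions changed between the two dates."""
    params = [
        ("chfieldto", changed_to),
        ("query_format", "advanced"),
        ("chfieldfrom", changed_from),
        ("bug_severity", "regression"),
    ]
    # bug_status is repeated once per status, as in the original URL.
    params += [("bug_status", status) for status in STATUSES]
    return BUGLIST + "?" + urlencode(params)
```

Calling regression_query("2013-01-01") reproduces the URL above;
passing an earlier date widens the window.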

Feb 21 2013
