
digitalmars.D - Bug Prediction at Google

Robert Clipsham <robert octarineparrot.com> writes:
I just read this pretty interesting article on the Google Engineering 
Tools website, thought it might interest some people here:

http://google-engtools.blogspot.com/2011/12/bug-prediction-at-google.html

( http://goo.gl/2O6YT <= a short link in case the above one gets wrapped)

It basically describes a new process in place at Google whereby each 
file within a project is assigned a rating saying how likely the given 
file is to have a bug in it compared to everything else. This can be 
used by code reviewers so they can take extra care when reviewing 
certain changes.

I really think github needs built-in review tools (more advanced than 
just pull requests) to allow things like the auto-tester to be run, or 
algorithms like this to be used for manual review, and so on.

-- 
Robert
http://octarineparrot.com/
Dec 15 2011
Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Thu, Dec 15, 2011 at 6:04 PM, Robert Clipsham
<robert octarineparrot.com> wrote:
 I just read this pretty interesting article on the Google Engineering Tools
 website, thought it might interest some people here:

 http://google-engtools.blogspot.com/2011/12/bug-prediction-at-google.html

 ( http://goo.gl/2O6YT <= a short link in case the above one gets wrapped)

 It basically describes a new process in place at Google whereby each file
 within a project is assigned a rating saying how likely the given file is to
 have a bug in it compared to everything else. This can be used by code
 reviewers so they can take extra care when reviewing certain changes.

 I really think github needs built in review tools (more advanced than just
 pull requests) to allow things like the auto-tester to be run, or algorithms
 like this to be used for manual review and so on.
Well, Github does have an API to allow that sort of thing to happen. In theory, a bot could examine a pull request, merge it and run tests, and post back the results.
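
For illustration, a minimal D sketch of such a bot's polling half, using the
github v3 API; the repository name, the User-Agent string and the
commented-out comment-posting call are placeholders, not anything the
project actually runs:

    // Sketch only: list open pull requests for a repo via the github v3 API.
    // The repository name below is an example, not necessarily the right one.
    import std.net.curl : HTTP, get;
    import std.json : parseJSON;
    import std.stdio : writefln;

    void main()
    {
        auto http = HTTP();
        // github's API rejects requests that don't send a User-Agent header.
        http.addRequestHeader("User-Agent", "d-pull-tester-sketch");

        auto url = "https://api.github.com/repos/D-Programming-Language/dmd/pulls?state=open";
        auto pulls = parseJSON(get(url, http).idup);
        foreach (pr; pulls.array)
        {
            writefln("pull #%s: %s", pr["number"].integer, pr["title"].str);

            // After merging and testing locally, a bot could report back by
            // POSTing an issue comment (needs an OAuth token with repo access):
            //   post("https://api.github.com/repos/<owner>/<repo>/issues/<n>/comments",
            //        `{"body": "auto-tester: all platforms passed"}`, http);
        }
    }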
Dec 15 2011
Brad Roberts <braddr puremagic.com> writes:
On Thu, 15 Dec 2011, Andrew Wiley wrote:

 On Thu, Dec 15, 2011 at 6:04 PM, Robert Clipsham
 <robert octarineparrot.com> wrote:
 I really think github needs built in review tools (more advanced than just
 pull requests) to allow things like the auto-tester to be run, or algorithms
 like this to be used for manual review and so on.
Well, Github does have an API to allow that sort of thing to happen. In theory, a bot could examine a pull request, merge it and run tests, and post back the results.
I'm 90% done with adding pull testing to the existing auto-tester fleet.
It's based on the work that Daniel did.  The basic overview:

server:
  every 10 minutes, check github for changes to pull requests (if they
    had notifications for pull changes, I'd use that instead)
  every time a trunk commit notification is received, check github for
    changes to pull requests

client:
  forever {
      check for trunk build
      if yes { build trunk; continue }
      check for pull build
      if yes { build pull; continue }
      sleep 1m
  }

Left to do:

 1) deploy changes to the tester hosts (it's on 2 already)

 2) finish the ui

 3) trigger pull rebuilds when trunk is updated

 4) add back in support for related pull requests (ie, two pulls that
    separately fail but together succeed)

 5) consider updating the pull request on github with tester results.
    This one needs to be done very carefully so as not to spam the
    report every time a build fails or succeeds.

 6) update the auto-tester Greasemonkey script to integrate the pull
    tester results with github's ui.

I'll hopefully finish 1 and 2 tonight.  I can do 3 manually until it's
automated.  I'm not sure about the ordering of 4-6; they're
nice-to-haves rather than must-haves.

All these extra builds are going to cost a lot of time.  There are about
100 open pull requests right now.  The fastest runs are on the order of
10 minutes; that's 6 per hour, or roughly 17 hours for the whole
backlog.  The slowest are closer to an hour.  So, obviously, there are
some growing pains to deal with.  I'll probably add a way for github
committers to prioritize pull requests so they build first.

Luckily this stuff is trivial to throw hardware at.. it's super
parallelizable.  Also, the hardware I have for those long runs is super
old: I think the freebsd/32 box is a P4-era box, and the win/32 box is
my Asus Eee Atom-based netbook.

If anyone wants to volunteer build hardware, particularly for the
non-linux platforms, please contact me via email (let's not clutter up
the newsgroup with that chatter).  Requirements:

 I need to be able to access it remotely.

 It needs to have reliable connectivity (bandwidth is pretty much a
 non-issue.. it doesn't need much at all).

 It needs to be hardware you're willing to have hammered fairly hard at
 random times.

I'll almost certainly write some code during the holiday to fire up and
use EC2 nodes for the windows and linux builds.  With the application of
just a little money, all of those runs could be done fully in parallel,
which would just be sweet to see.  Ok, I admit it.. I love working at
Amazon on EC2, and I'm happy to finally have a project that could
actually use it.

Later,
Brad
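
For illustration, the client loop above could look something like this
minimal D sketch; the two server URLs and the runBuild helper are invented
for the example and are not the tester's real protocol:

    // The two URLs and the runBuild helper are placeholders; the real
    // tester's client/server protocol is whatever Brad implemented, not this.
    import core.thread : Thread;
    import core.time : minutes;
    import std.net.curl : get;
    import std.stdio : writeln;

    void runBuild(string what, string id)
    {
        // Placeholder: check out the sources, build, run the test suite,
        // and upload the log to the server.
        writeln("building ", what, " ", id);
    }

    void main()
    {
        for (;;)
        {
            // Ask the server for work; an empty reply means nothing to do.
            auto trunk = get("http://d.puremagic.com/test-results/next-trunk").idup;
            if (trunk.length) { runBuild("trunk", trunk); continue; }

            auto pull = get("http://d.puremagic.com/test-results/next-pull").idup;
            if (pull.length) { runBuild("pull", pull); continue; }

            Thread.sleep(1.minutes);
        }
    }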
Dec 15 2011
Brad Anderson <eco gnuk.net> writes:
On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:

 On Thu, 15 Dec 2011, Andrew Wiley wrote:

 On Thu, Dec 15, 2011 at 6:04 PM, Robert Clipsham
 <robert octarineparrot.com> wrote:
 I really think github needs built in review tools (more advanced than
just
 pull requests) to allow things like the auto-tester to be run, or
algorithms
 like this to be used for manual review and so on.
Well, Github does have an API to allow that sort of thing to happen. In theory, a bot could examine a pull request, merge it and run tests, and post back the results.
[...]
 With the application of just a little money
Out of curiosity, how much is a little?
Dec 16 2011
Brad Roberts <braddr puremagic.com> writes:
On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:
 
 
     Left to do:
 
      1) deploy changes to the tester hosts (it's on 2 already)
done
      2) finish the ui
very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml
      3) trigger pull rebuilds when trunk is updated
partly implemented, but not being done yet
      4) add back in support for related pull requests (ie, two pulls that
         separately fail but together succeed)
 
      5) consider adding updating the pull request on github with tester
         results.  This one needs to be very carefully done to not spam the
         report every time a build fails or succeeds.
 
      6) update the auto-tester grease monkey script to integrate the pull
         tester results with github's ui.
 
     I'll hopefully finish 1 and 2 tonight.  I can do 3 manually until it's
     automated.  I'm not sure about the ordering of 4-6.  They're nice to haves
     rather than must haves.
 
 [...]
 
 
 With the application of just a little money
Out of curiosity, how much is a little?
I'll need to experiment.  It's the kind of thing where the more money is
thrown at the problem, the faster the builds can be churned through.
The limit of that would be one box per platform per pull request.. a
silly extreme, though, and not possible anyway since EC2 doesn't support
some platforms, such as OSX.

One c1.medium running 24x7 at the current spot price is about $29/month.
Spot is perfect because this is a totally interruptible / resumable
process.  I'll need to do a little work to make it resume an in-flight
test run, but it's not hard at all to do.

After watching the builds slowly churn through the first round of build
requests, it's clear to me that one of the areas where I'll need to
invest a good bit more time is the scheduling of pulls.  Right now it's
pseudo-random (an artifact of how the data is stored in a hash table).

Later,
Brad
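
As a sketch of what a less random scheduler could look like, in D; the
Pull struct and its fields are invented for the example, the idea being
simply committer-assigned priority first, then oldest activity first:

    // Invented bookkeeping; the point is deterministic ordering instead of
    // whatever order the hash table happens to yield.
    import std.algorithm : sort;
    import std.datetime : SysTime;

    struct Pull
    {
        int number;
        int priority;       // set by committers; 0 = default
        SysTime updatedAt;  // when the pull request last changed
    }

    Pull[] schedule(Pull[] pending)
    {
        // Highest priority first; within a priority level, oldest change
        // first, so every pull request eventually gets its turn.
        pending.sort!((a, b) =>
            a.priority != b.priority ? a.priority > b.priority
                                     : a.updatedAt < b.updatedAt);
        return pending;
    }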
Dec 16 2011
Robert Clipsham <robert octarineparrot.com> writes:
On 17/12/2011 06:40, Brad Roberts wrote:
 On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:


      Left to do:

       1) deploy changes to the tester hosts (it's on 2 already)
done
       2) finish the ui
very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml
       3) trigger pull rebuilds when trunk is updated
partly implemented, but not being done yet
Idea: I noticed most pull requests were failing when I looked at it, due
to the main build failing.  That's a lot of wasted computing time.
Perhaps it would be a good idea to refuse to test pulls while dmd HEAD
isn't compiling?  This would be problematic for the 1 in 100 pull
requests designed to fix the breakage, but would save a lot of testing.

An alternative method would be to test all of them, but if a pull
request previously passed, then dmd HEAD broke, then the pull broke,
stop testing it until dmd HEAD is fixed.

-- 
Robert
http://octarineparrot.com/
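
A rough D sketch of that second heuristic, with made-up state tracking
rather than whatever the tester really stores:

    // Made-up bookkeeping; the real tester's data model will differ.
    enum State { unknown, passed, failed }

    struct History
    {
        State lastTrunk;        // most recent trunk build result
        State pullBeforeBreak;  // the pull's result before trunk last broke
        State pullAfterBreak;   // the pull's result since trunk broke
    }

    // True if building this pull right now would most likely just burn time.
    bool deferPull(History h)
    {
        if (h.lastTrunk != State.failed)
            return false;                  // trunk is fine, test as usual
        // Trunk is broken: keep testing only the pulls that might contain
        // the fix, i.e. skip ones that used to pass and then broke along
        // with trunk.
        return h.pullBeforeBreak == State.passed
            && h.pullAfterBreak == State.failed;
    }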
Dec 17 2011
Brad Roberts <braddr puremagic.com> writes:
On 12/17/2011 4:56 AM, Robert Clipsham wrote:
 On 17/12/2011 06:40, Brad Roberts wrote:
 On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:


      Left to do:

       1) deploy changes to the tester hosts (it's on 2 already)
done
       2) finish the ui
very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml
       3) trigger pull rebuilds when trunk is updated
partly implemented, but not being done yet
Idea: I noticed most pull requests were failing when I looked at it, due
to the main build failing.  That's a lot of wasted computing time.
Perhaps it would be a good idea to refuse to test pulls while dmd HEAD
isn't compiling?  This would be problematic for the 1 in 100 pull
requests designed to fix the breakage, but would save a lot of testing.

An alternative method would be to test all of them, but if a pull
request previously passed, then dmd HEAD broke, then the pull broke,
stop testing it until dmd HEAD is fixed.
Yeah.  I know I need to do something in that space and just haven't yet.
This whole thing is only a few evenings old and is just now starting to
really work.  I'm still focused on making sure it's grossly functional
enough to be useful; optimizations and polish will wait a little longer.

Thanks,
Brad
Dec 17 2011
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On Sat, 17 Dec 2011 20:54:24 +0100, Brad Roberts <braddr puremagic.com>  
wrote:

 On 12/17/2011 4:56 AM, Robert Clipsham wrote:
 On 17/12/2011 06:40, Brad Roberts wrote:
 On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:


      Left to do:

       1) deploy changes to the tester hosts (it's on 2 already)
done
       2) finish the ui
very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml
       3) trigger pull rebuilds when trunk is updated
partly implemented, but not being done yet
Idea: I noticed most pull requests were failing when I looked at it, due
to the main build failing.  That's a lot of wasted computing time.
Perhaps it would be a good idea to refuse to test pulls while dmd HEAD
isn't compiling?  This would be problematic for the 1 in 100 pull
requests designed to fix the breakage, but would save a lot of testing.

An alternative method would be to test all of them, but if a pull
request previously passed, then dmd HEAD broke, then the pull broke,
stop testing it until dmd HEAD is fixed.
Yeah.  I know I need to do something in that space and just haven't yet.
This whole thing is only a few evenings old and is just now starting to
really work.  I'm still focused on making sure it's grossly functional
enough to be useful; optimizations and polish will wait a little longer.

Thanks,
Brad
Another optimization idea: put pull requests that fail to merge on an
inactive list, send a comment to github, and wait until the submitter
does something about them.

martin
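
A quick D sketch of that flow; tryMerge shells out to git, and the
commented-out postComment stands in for a github API call, both invented
for the example:

    // tryMerge and postComment are stand-ins, not part of the actual tester.
    import std.process : execute;

    bool tryMerge(string branch)
    {
        // Attempt the merge without committing; a non-zero status means it
        // didn't apply cleanly.  Abort afterwards to leave the tree clean.
        auto r = execute(["git", "merge", "--no-commit", "--no-ff", branch]);
        execute(["git", "merge", "--abort"]);
        return r.status == 0;
    }

    void checkPull(int number, string branch, ref bool[int] inactive)
    {
        if (number in inactive)
            return;                        // already waiting on the submitter
        if (!tryMerge(branch))
        {
            inactive[number] = true;
            // postComment(number, "does not merge cleanly against master; "
            //     ~ "please rebase and the tester will pick it up again");
        }
    }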
Dec 19 2011
Brad Roberts <braddr puremagic.com> writes:
On 12/19/2011 4:05 AM, Martin Nowak wrote:
 On Sat, 17 Dec 2011 20:54:24 +0100, Brad Roberts <braddr puremagic.com> wrote:
 
 On 12/17/2011 4:56 AM, Robert Clipsham wrote:
 On 17/12/2011 06:40, Brad Roberts wrote:
 On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:


      Left to do:

       1) deploy changes to the tester hosts (it's on 2 already)
done
       2) finish the ui
very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml
       3) trigger pull rebuilds when trunk is updated
partly implemented, but not being done yet
[...]
Another optimization idea: put pull requests that fail to merge on an
inactive list, send a comment to github, and wait until the submitter
does something about them.

martin
Way ahead of you, but it's low priority from a throughput standpoint:
those take almost no time to process.  The benefit there is in getting
the notification back to the pull submitter, but that's true for all
failures. :)

The biggest win I can think of right now is what I'll work on next: if
one platform has failed the build, skip it on the other platforms unless
there's nothing else to do.  With that, only one build is wasted, and
only on the platform that's making the fastest progress anyway.

Later,
Brad
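
A small D sketch of that skip rule; the Result bookkeeping and the
pickNext signature are invented for the example:

    // "platform" would be something like "linux/64" or "win/32".
    enum Result { pending, passed, failed }

    alias PullResults = Result[string];   // per-platform results for one pull

    // Pick the next pull for `platform`: prefer pulls that no platform has
    // failed yet; fall back to already-failed ones only when nothing else is
    // queued.  Returns -1 when there is nothing to do.
    int pickNext(PullResults[int] queue, string platform)
    {
        int fallback = -1;
        foreach (number, results; queue)
        {
            if (results.get(platform, Result.pending) != Result.pending)
                continue;                  // this platform already ran it
            bool failedSomewhere = false;
            foreach (r; results.byValue)
                if (r == Result.failed) { failedSomewhere = true; break; }
            if (!failedSomewhere)
                return number;             // fresh work takes precedence
            fallback = number;             // keep around as a last resort
        }
        return fallback;
    }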
Dec 19 2011
Brad Anderson <eco gnuk.net> writes:
On Fri, Dec 16, 2011 at 11:40 PM, Brad Roberts <braddr puremagic.com> wrote:

 On 12/16/2011 1:29 PM, Brad Anderson wrote:
 On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:
 [...]


 With the application of just a little money
Out of curiosity, how much is a little?
I'll need to experiment.  It's the kind of thing where the more money is
thrown at the problem, the faster the builds can be churned through.
The limit of that would be one box per platform per pull request.. a
silly extreme, though, and not possible anyway since EC2 doesn't support
some platforms, such as OSX.

One c1.medium running 24x7 at the current spot price is about $29/month.
Spot is perfect because this is a totally interruptible / resumable
process.  I'll need to do a little work to make it resume an in-flight
test run, but it's not hard at all to do.

After watching the builds slowly churn through the first round of build
requests, it's clear to me that one of the areas where I'll need to
invest a good bit more time is the scheduling of pulls.  Right now it's
pseudo-random (an artifact of how the data is stored in a hash table).

Later,
Brad
That seems very reasonable. It sounds like this autotester will help immensely with processing the pull request backlog. More free time for Walter and everyone else who works on the project is a great thing.
Dec 16 2011