
digitalmars.D - help design a query interface for DStress' "bad apple" database

reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:

DStress now stores unexpected testcase results along with compiler
messages and the offending source code in a MySQL database. What would
be the best query interface(s)?

The three tables are:
#
# Message(
#	id smallint(2),
#	message text
# )
#
# Testcase(
#	name char(40),
#	source blob,
#	id smallint(2),
#	type
#	enum('compile','complex','nocompile','norun','run','undefined')
# )
#
# Result (
#	testcase smallint(2) unsigned,
#	message smallint(2) unsigned,
#	result enum('FAIL','XPASS','ERROR'),
#	options enum('','-g','-inline','-fPIC', ... )
# )
#

Sample data can be found here: (167K)
http://dstress.kuehne.cn/raw_results/dmd-0.175.sql.gz 

The available tools are MySQL 4.1.21 and PHP4.

Thomas


Dec 01 2006
parent reply Georg Wrede <georg.wrede nospam.org> writes:
Thomas Kuehne wrote:
 DStress now stores unexpected testcase results along with compiler
 messages and the offending source code in a MySQL database. What would
 be the best query interface(s)?

Is that "what technology to use" or "what should the web page look like, i.e. what fields should there be for the user to fill in"?

More important would be to know _what things_ users want to find there. In other words, find out the *use cases*, and your answers will present themselves automatically.

Then again, it is natural that the test cases are stored in a relational database. There's hardly any other way to manage them conveniently. But that does not mean that the user wants to interact with the database at all.

For example, it may well be that there are only two kinds of retrievals. The first kind is a person who has seen an article about a specific test case and wants to download only that one. The second kind is someone who wants to help fix D bugs and therefore wants to download all current unexpected results for the specific OS/compiler combination he happens to own.

IF THIS REALLY is the case, then we don't need the DB UI at all. Simply have the (is it 6 different?) OS/compiler bug sets as downloadable zips on the downloads page.

As for the individual cases, they're all small, and the most convenient way to get them would be a page where they are listed, and where clicking a link brings the source code right onto your screen. You can look at it, and if it looks like what you want, "save as" on your own computer. (I have suggested this to you some 4-5 times during the past 5 years.)

Of course I might be all wrong, and we do need a user interface that lets the user choose "all odd-numbered test cases either for Mac with GDC or Linux with DMD, submitted third quarter 1995, during West Coast office hours, and being more than 1k in length".

Even if you do create this DB interface, I think the most popular downloads will still be the all-for-one-arch and the single case browsing+save-as.
Dec 01 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Georg Wrede wrote on 2006-12-02:
 Thomas Kuehne wrote:
 
 DStress now stores unexpected testcase results along with compiler
 messages and the offending source code in a MySQL database. What would
 be the best query interface(s)?

 Is that "what technology to use" or "what should the web page look like, i.e. what fields should there be for the user to fill in"?
 
 More important would be to know _what things_ users want to find there. In other words, find out the *use cases*, and your answers will present themselves automatically.

That's exactly what I'm trying to do. One use case I'm aware of, and am going to implement, is searching for all testcases that produce a certain compiler message. This should help keep down the number of duplicate bug reports that differ only marginally in their code.
 For example, it may well be that there are only two kinds of retrievals: 
 the first kind is a person who's seen an article about a specific test 
 case, and he wants to download only that one. And the second kind being 
 someone who wants to help fix D bugs, and therefore he wants to 
 download all current unexpected results for the specific OS/compiler 
 combination he happens to own.

 IF THIS REALLY is the case, then we don't need the DB UI at all. Simply 
 have the (is it 6 different?) OS/compiler bug sets as downloadable zips, 
 on the downloads page.

That is a good idea.
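A rough sketch of how such a downloadable bundle could be built, in Python for illustration rather than PHP4. The schema shown has no OS or compiler column, so the sketch assumes the caller has already selected the (name, source) rows for one combination. The function name and the ".d" file suffix are assumptions.

```python
import io
import zipfile


def bundle_testcases(testcases):
    """Pack (name, source_bytes) pairs into a zip archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, source in testcases:
            # The ".d" suffix is an assumption; the Testcase table stores only a name.
            archive.writestr(name + ".d", source)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample rows standing in for a query result.
    data = bundle_testcases([
        ("alpha_01", b"void main() {}\n"),
        ("beta_02", b"int x = ;\n"),
    ])
    with open("unexpected-results.zip", "wb") as handle:
        handle.write(data)
```

Building the archive once per test run and serving it as a static file would avoid hitting the database on every download.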
 As for the individual cases, they're all small, and the most convenient 
 way to get them would be if you have a page where they are listed, and 
 clicking a link brings the source code right on your screen, where you 
 can look at it, and if it looks like what you want, then "save-as" on 
 your own computer. (I have suggested this to you some 4-5 times during 
 the past 5 years.)

The problem is misbehaving crawler bots that repeatedly download parts of the site when the files have a common MIME type like text/plain instead of an uncommon one like text/x-dsrc, thereby consuming gigabytes of traffic. I'm currently preparing to move to a hosting provider with more fine-grained controls. Let's wait and see whether the bots can be kept below a certain limit; if so, I'll enable the "plain view" feature.
 Of course I might be all wrong, and we do need a user interface that 
 lets the user choose "all odd-numbered test cases either for Mac with 
 GDC or Linux with DMD, submitted third quarter 1995, during West Coast 
 office hours, and being more than 1k in length".

 Even if you do create this DB interface, I think the most popular 
 downloads will still be the all-for-one-arch and the single case 
 browsing+save-as.

Thanks for your ideas.

Thomas
Dec 03 2006
parent reply Kirk McDonald <kirklin.mcdonald gmail.com> writes:
Thomas Kuehne wrote:
 The problem is misbehaving crawler bots repeatedly downloading some
 parts of the site if the files have a common MIME type like text/plain
 instead of an uncommon one like text/x-dsrc, thereby consuming gigabytes.
 I'm currently preparing to move to a hosting provider with more fine-grained controls.
 Let's wait and see if the bots can be kept below a certain limit, if
 so I'll enable the "plain view" feature.
 

No robots.txt? Or is that what you meant by "misbehaving"?
http://en.wikipedia.org/wiki/Robots.txt

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org
Dec 03 2006
parent Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Kirk McDonald wrote on 2006-12-03:
 Thomas Kuehne wrote:
 The problem is misbehaving crawler bots repeatedly downloading some
 parts of the site if the files have a common MIME type like text/plain
 instead of an uncommon one like text/x-dsrc, thereby consuming gigabytes.
 I'm currently preparing to move to a hosting provider with more fine-grained controls.
 Let's wait and see if the bots can be kept below a certain limit, if
 so I'll enable the "plain view" feature.
 

No robots.txt? Or is that what you meant by "misbehaving"? http://en.wikipedia.org/wiki/Robots.txt

They ignore robots.txt and re-request the same files over and over again. Since then I have disabled directory listings, and I use robots.txt only to identify smart robots (those that correctly handle redirects of robots.txt).

Thomas
Dec 03 2006
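For reference, a small sketch of the check a well-behaved crawler is expected to make before requesting a file, using Python's standard urllib.robotparser module. The robots.txt content, the bot name, and the example.org URLs are hypothetical; the actual rules on the DStress site are not shown in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, not the real file from the DStress site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /raw_results/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/index.html",
                "http://example.org/raw_results/dump.sql.gz"):
        verdict = "allowed" if may_fetch(EXAMPLE_ROBOTS_TXT, "ExampleBot", url) else "disallowed"
        print(url, verdict)
```

The bots described in the thread skip this step entirely, which is why robots.txt alone did not solve the traffic problem.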