digitalmars.D.learn - Want to help DMD bugfixing? Write a simple utility.

Don (13/13) Mar 19 2011 Here's the task:

David Nadlinger (7/12) Mar 19 2011 I realize that you asked for a very specific utility, but in several
Jonathan M Davis (7/21) Mar 19 2011 Unfortunately, to do that 100% correctly, you need to actually have a wo...

Don (7/30) Mar 19 2011 I didn't say it needs 100% accuracy. You can assume, for example, that

Jonathan M Davis (17/50) Mar 19 2011 I tried to create a similar tool before and gave up because I couldn't m...

Don (20/64) Mar 19 2011 No. All you know there's a bug that's being triggered somewhere in

Jonathan M Davis (16/87) Mar 20 2011 Hmmm. I really don't know what could be done to fix that (other than mak...

Don (7/93) Mar 20 2011 The problem is purely the large fraction of the module which is devoted

Jonathan M Davis (17/119) Mar 20 2011 Well, for the moment at least, if you remove the

Regan Heath (12/118) Mar 23 2011 I was just thinking .. if we get a list of the symbols the linker is

Jonathan M Davis (2/144) Mar 23 2011 That would require a full-blown D lexer and parser.

Regan Heath (6/17) Mar 23 2011 Yeah, I thought as much. I wonder if the new guy "Ilya" who just posted...
Kai Meyer (12/156) Mar 23 2011 Why are we talking about having to recreate a full-blown lexer and

Jonathan M Davis (34/200) Mar 23 2011 There are tasks for which you need to be able to lex and parse D code. T...

Regan Heath (7/9) Mar 25 2011 Is that last bit true? You definitely need to be able to lex it, but

spir (10/16) Mar 25 2011 At first sight, you're both wrong: you'd need to count { } levels. Also,...

Don (21/40) Mar 25 2011 Yes, exactly: you just need to lex strings (including q{}), comments

Nick Sabalausky (11/18) Mar 25 2011 No, to do it 100% reliably, you do need lexing/parsing, and also the

Alexey Prokhin (3/10) Mar 24 2011 There is a third one: http://code.google.com/p/dil/. The main page says ...

Nick Sabalausky (14/29) Mar 25 2011 The nearly-done v0.4 of my Goldie parsing system (zlib/libpng license) c...

Nick Sabalausky (9/36) Mar 25 2011 Note that this probably isn't a big of a problem as it sounds:

spir (11/18) Mar 24 2011 I fully support this. We desperately need it, I guess, working and maint...

Andrej Mitrovic (2/4) Mar 23 2011 Isn't DDMD written in D? I'm not sure about how finished it is though.

Nick Sabalausky (7/13) Mar 25 2011 I've done a little bit of playing around with DDMD for a (still only jus...

Michel Fortin (15/36) Mar 19 2011 Well, I made simple lexer for D strings, comments, identifiers and a

Kai Meyer (5/18) Mar 20 2011 Is there a copy of the official D grammar somewhere online? I wrote a

Zirneklis (3/24) Mar 20 2011 As far as I know the documentation /is/ the official grammar
Trass3r (2/5) Mar 24 2011 The official D grammar is spread among the specification.

Ary Manzana (2/15) Mar 20 2011 Can it be done in Ruby? Or you need it in D?

Simen kjaeraas (7/25) Mar 21 2011 Part of the idea was that someone use it to learn D. However, the import...

Jonathan M Davis (9/15) Mar 23 2011 Yes, but the lexer and parser in ddmd are not only GPL (which would be a...

Nick Sabalausky (15/34) Mar 25 2011 I don't know about the license issues, but I don't think the API is a bi...

Andrej Mitrovic (2/19) Mar 23 2011 I didn't even know it was GPL. It doesn't come with a license file.
Andrej Mitrovic (2/2) Mar 23 2011 What about the artistic license, the front-end can be used with that
Jonathan M Davis (6/8) Mar 23 2011 I don't know what the exact licensing situation is. However, as I unders...

Don <nospam nospam.com> writes:

Here's the task:
Given a .d source file, strip out all of the unittest {} blocks,
including everything inside them.
Strip out all comments as well.
Print out the resulting file.

Motivation: Bug reports frequently come with very large test cases.
Even ones which look small often import from Phobos.
Reducing the test case is the first step in fixing the bug, and it's 
frequently ~30% of the total time required. Stripping out the unit tests 
is the most time-consuming and error-prone part of reducing the test case.

This should be a good task if you're relatively new to D but would like 
to do something really useful.
-Don

Mar 19 2011

David Nadlinger <see klickverbot.at> writes:

On 3/20/11 1:11 AM, Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.

I realize that you asked for a very specific utility, but in several 
instances, http://delta.tigris.org/ worked fine for me for reducing 
large test cases.

Parts of it are tailored to C/C++ though, so a port/adaptation for D would 
be a nice project as well.

David

Mar 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 19 March 2011 17:11:56 Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.
 
 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.
 
 This should be a good task if you're relatively new to D but would like
 to do something really useful.

Unfortunately, to do that 100% correctly, you need to actually have a working D 
lexer (and possibly parser). You might be able to get something close enough to 
work in most cases, but it doesn't take all that much to throw off a basic 
implementation of this sort of thing if you don't lex/parse it with something 
which properly understands D.

- Jonathan M Davis

Mar 19 2011

Don <nospam nospam.com> writes:

Jonathan M Davis wrote:
 
 Unfortunately, to do that 100% correctly, you need to actually have a working D
 lexer (and possibly parser). You might be able to get something close enough to
 work in most cases, but it doesn't take all that much to throw off a basic
 implementation of this sort of thing if you don't lex/parse it with something
 which properly understands D.
 
 - Jonathan M Davis

I didn't say it needs 100% accuracy. You can assume, for example, that 
"unittest" always occurs at the start of a line. The only other things 
you need to lex are {}, string literals, and comments.

BTW, the immediate motivation for this is std.datetime in Phobos. The 
sheer number of unittests in there is an absolute catastrophe for 
tracking down bugs. It makes a tool like this MANDATORY.
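
A minimal sketch of what such a utility could look like, under exactly the
assumptions stated above: "unittest" is only recognised at the start of a
line, and the only things lexed are braces, string and character literals,
and comments. The names stripTests, skipQuoted and skipBlockComment are
made up for this illustration. It does not understand r"..." or q"..."
strings, so treat it as a starting point rather than a finished tool.

```d
import std.array : appender;
import std.ascii : isAlphaNum;

/// Index just past the string or character literal that starts at s[i].
size_t skipQuoted(string s, size_t i)
{
    immutable char quote = s[i];
    for (++i; i < s.length; ++i)
    {
        if (s[i] == '\\' && quote != '`')
            ++i;                        // skip the escaped character
        else if (s[i] == quote)
            return i + 1;
    }
    return s.length;                    // unterminated literal
}

/// Index just past the /* */ or /+ +/ comment that starts at s[i].
size_t skipBlockComment(string s, size_t i)
{
    immutable char mark = s[i + 1];     // '*' or '+'
    int depth = 1;
    i += 2;
    while (depth > 0 && i + 1 < s.length)
    {
        if (s[i] == mark && s[i + 1] == '/') { --depth; i += 2; }
        else if (mark == '+' && s[i] == '/' && s[i + 1] == '+') { ++depth; i += 2; }
        else ++i;
    }
    return depth == 0 ? i : s.length;
}

/// Returns src without comments and without unittest blocks.
string stripTests(string src)
{
    auto result = appender!string();
    bool lineStart = true;   // only whitespace seen so far on this line
    bool skipping = false;   // currently inside a unittest block
    int depth = 0;           // brace depth within that block
    size_t i = 0;

    while (i < src.length)
    {
        immutable char c = src[i];
        immutable char next = i + 1 < src.length ? src[i + 1] : '\0';

        if (c == '/' && next == '/')
        {
            while (i < src.length && src[i] != '\n') ++i;   // keep the newline
        }
        else if (c == '/' && (next == '*' || next == '+'))
        {
            i = skipBlockComment(src, i);
        }
        else if (c == '"' || c == '`' || c == '\'')
        {
            immutable end = skipQuoted(src, i);
            if (!skipping) result.put(src[i .. end]);
            lineStart = false;
            i = end;
        }
        else if (lineStart && !skipping && i + 8 <= src.length
                 && src[i .. i + 8] == "unittest"
                 && (i + 8 == src.length
                     || !(isAlphaNum(src[i + 8]) || src[i + 8] == '_')))
        {
            skipping = true;            // drop everything up to the matching }
            depth = 0;
            i += 8;
        }
        else if (skipping)
        {
            if (c == '{') ++depth;
            else if (c == '}' && --depth <= 0) skipping = false;
            ++i;
        }
        else
        {
            result.put(c);
            if (c == '\n') lineStart = true;
            else if (c != ' ' && c != '\t' && c != '\r') lineStart = false;
            ++i;
        }
    }
    return result.data;
}

int main(string[] args)
{
    import std.file : readText;
    import std.stdio : stderr, write;

    if (args.length != 2)
    {
        stderr.writeln("usage: stripunittest file.d");
        return 1;
    }
    write(stripTests(readText(args[1])));
    return 0;
}
```

Because literals and comments are consumed before braces are counted, a "}"
inside a string or comment in a unittest block does not end the block early.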

Mar 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 19 March 2011 18:04:57 Don wrote:
 
 I didn't say it needs 100% accuracy. You can assume, for example, that
 "unittest" always occurs at the start of a line. The only other things
 you need to lex are {}, string literals, and comments.
 
 BTW, the immediate motivation for this is std.datetime in Phobos. The
 sheer number of unittests in there is an absolute catastrophe for
 tracking down bugs. It makes a tool like this MANDATORY.

I tried to create a similar tool before and gave up because I couldn't make it 
100% accurate and was running into problems with it. If someone wants to take a 
shot at it though, that's fine.

As for the unit tests in std.datetime making it hard to track down bugs, that 
only makes sense to me if you're trying to look at the whole thing at once and 
track down a compiler bug which happens _somewhere_ in the code, but you don't 
know where. Other than a problem like that, I don't really see how the unit 
tests get in the way of tracking down bugs. Is it that you need to compile in a 
version of std.datetime which doesn't have any unit tests compiled in but you 
still need to compile with -unittest for other stuff?

I _am_ working on streamlining the unit tests in std.datetime so that they take 
up fewer lines of code without reducing how well they cover the code, so 
depending on your problem with the amount of unit test code, that could help, 
but I expect that whatever your core problem with the number of unit tests is, 
that won't fix it.

- Jonathan M Davis

Mar 19 2011

Don <nospam nospam.com> writes:

Jonathan M Davis wrote:
 
 I tried to create a similar tool before and gave up because I couldn't make it
 100% accurate and was running into problems with it. If someone wants to take a
 shot at it though, that's fine.
 
 As for the unit tests in std.datetime making it hard to track down bugs, that
 only makes sense to me if you're trying to look at the whole thing at once and
 track down a compiler bug which happens _somewhere_ in the code, but you don't
 know where. Other than a problem like that, I don't really see how the unit
 tests get in the way of tracking down bugs. Is it that you need to compile in a
 version of std.datetime which doesn't have any unit tests compiled in but you
 still need to compile with -unittest for other stuff?

No. All you know is that there's a bug that's being triggered somewhere in 
Phobos (with -unittest). It's probably not in std.datetime.
But Phobos is a horrible ball of mud where everything imports everything 
else, and std.datetime is near the centre of that ball. What you have to 
do is reduce the amount of code, and especially the number of modules, 
as rapidly as possible; this means getting rid of imports.

To do this, you need to remove large chunks of code from the files. This 
is pretty simple; comment out half of the file, if it still works, then 
delete it. Normally this works well because typically only about a dozen 
lines are actually being used. After doing this about three or four 
times it's small enough that you can usually get rid of most of the imports.
Unittests foul this up because they use functions/classes from inside 
the file.

In the case of std.datetime it's even worse because the signal-to-noise 
ratio is so incredibly poor; it's really difficult to find the few lines 
of code that are actually being used by other Phobos modules.

My experience (obviously only over the last month or so) has been that 
if the reduction of a bug is non-obvious, more than 10% of the total 
time taken to fix that bug is the time taken to cut down std.datetime.
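
The halving procedure described above can be sketched roughly as follows.
This is only an illustration: reduceLines and stillFails are hypothetical
names, and "it still works" is taken to mean "the bug still reproduces". In
practice the predicate would write the candidate out and run the compiler on
it; here a toy predicate stands in for that step.

```d
import std.algorithm : canFind, min;
import std.stdio : writeln;

/// Repeatedly tries to delete chunks of lines, halving the chunk size each
/// round. A deletion is kept whenever stillFails says the bug still reproduces.
string[] reduceLines(string[] lines, bool delegate(string[]) stillFails)
{
    for (size_t chunk = lines.length / 2; chunk > 0; chunk /= 2)
    {
        size_t start = 0;
        while (start < lines.length)
        {
            immutable end = min(start + chunk, lines.length);
            auto candidate = lines[0 .. start] ~ lines[end .. $];
            if (stillFails(candidate))
                lines = candidate;   // the chunk was not needed: drop it
            else
                start = end;         // the chunk is needed: keep it and move on
        }
    }
    return lines;
}

void main()
{
    auto source = ["import a;", "import b;", "void f() {}", "void g() {}", "bug();"];

    // Toy stand-in for "write the file, run dmd, check for the same failure".
    bool stillFails(string[] ls)
    {
        return ls.canFind("import b;") && ls.canFind("bug();");
    }

    writeln(reduceLines(source, &stillFails));
}
```

With the toy predicate, only the two lines it depends on survive. A unittest
block that calls a function elsewhere in the file makes the predicate reject
the deletion of that function, which is why stripping the unittests first
makes each round far more productive.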

Mar 19 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 Don wrote:
 
 No. All you know is that there's a bug that's being triggered somewhere in
 Phobos (with -unittest). It's probably not in std.datetime.
 But Phobos is a horrible ball of mud where everything imports everything
 else, and std.datetime is near the centre of that ball. What you have to
 do is reduce the amount of code, and especially the number of modules,
 as rapidly as possible; this means getting rid of imports.
 
 To do this, you need to remove large chunks of code from the files. This
 is pretty simple; comment out half of the file, if it still works, then
 delete it. Normally this works well because typically only about a dozen
 lines are actually being used. After doing this about three or four
 times it's small enough that you can usually get rid of most of the
 imports. Unittests foul this up because they use functions/classes from
 inside the file.
 
 In the case of std.datetime it's even worse because the signal-to-noise
 ratio is so incredibly poor; it's really difficult to find the few lines
 of code that are actually being used by other Phobos modules.
 
 My experience (obviously only over the last month or so) has been that
 if the reduction of a bug is non-obvious, more than 10% of the total
 time taken to fix that bug is the time taken to cut down std.datetime.

Hmmm. I really don't know what could be done to fix that (other than making it 
easier to rip out the unittest blocks). And enough of std.datetime depends on 
other parts of std.datetime that trimming it down isn't (and can't be) exactly 
easy. In general, SysTime is the most likely type to be used, and it depends 
on Date, TimeOfDay, and DateTime, and all 4 of those depend on most of the 
free functions in the module. It's not exactly designed in a manner which 
allows you to cut out large chunks and still have it compile. And I don't 
think that it _could_ be designed that way and still have the functionality 
that it has.

I guess that this sort of problem is one that would pop up mainly when dealing 
with compiler bugs. I have a hard time seeing it popping up with your typical 
bug in Phobos itself. So, I guess that this is the sort of thing that you'd 
run into and I likely wouldn't.

I really don't know how the situation could be improved though other than 
making it easier to cut out the unit tests.

- Jonathan M Davis

Mar 20 2011

Don <nospam nospam.com> writes:

Jonathan M Davis wrote:
 
 Hmmm. I really don't know what could be done to fix that (other than making it 
 easier to rip out the unittest blocks). And enough of std.datetime depends on 
 other parts of std.datetime that trimming it down isn't (and can't be) exactly 
 easy. In general, SysTime is the most likely type to be used, and it depends 
 on Date, TimeOfDay, and DateTime, and all 4 of those depend on most of the 
 free functions in the module. It's not exactly designed in a manner which 
 allows you to cut out large chunks and still have it compile. And I don't 
 think that it _could_ be designed that way and still have the functionality 
 that it has.

The problem is purely the large fraction of the module which is devoted 
to unit tests. That's all.

 
 I guess that this sort of problem is one that would pop up mainly when dealing 
 with compiler bugs. I have a hard time seeing it popping up with your typical 
 bug in Phobos itself. So, I guess that this is the sort of thing that you'd 
 run into and I likely wouldn't.

Yes.

 I really don't know how the situation could be improved though other than 
 making it easier to cut out the unit tests.
 
 - Jonathan M Davis

Hence the motivation for this utility. The problem exists in all 
modules, but in std.datetime it's such an obvious time-waster that I 
can't keep ignoring it.

Mar 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 Don wrote:
 Jonathan M Davis wrote:
 
 Hmmm. I really don't know what could be done to fix that (other than
 making it easier to rip out the unittest blocks). And enough of
 std.datetime depends on other parts of std.datetime that trimming it
 down isn't (and can't be) exactly easy. In general, SysTime is the most
 likely type to be used, and it depends on Date, TimeOfDay, and DateTime,
 and all 4 of those depend on most of the free functions in the module.
 It's not exactly designed in a manner which allows you to cut out large
 chunks and still have it compile. And I don't think that it _could_ be
 designed that way and still have the functionality that it has.

 
 The problem is purely the large fraction of the module which is devoted
 to unit tests. That's all.
 
 I guess that this sort of problem is one that would pop up mainly when
 dealing with compiler bugs. I have a hard time seeing it popping up with
 your typical bug in Phobos itself. So, I guess that this is the sort of
 thing that you'd run into and I likely wouldn't.

 
 Yes.
 
 I really don't know how the situation could be improved though other than
 making it easier to cut out the unit tests.
 
 - Jonathan M Davis

 
 Hence the motivation for this utility. The problem exists in all
 modules, but in std.datetime it's such an obvious time-waster that I
 can't keep ignoring it.

Well, for the moment at least, if you remove the

version = testStdDateTime;
version = enableWindowsTest;

lines near the top of the file, then pretty much all of the unittest blocks 
will no longer be compiled in (there might be a couple which are still 
compiled in, but not many). So, that could help you until the utility that you 
want is done. Unfortunately, that also means that the utility will have to be 
smarter if it's going to work on std.datetime. While most of the 
version(testStdDateTime) blocks are currently _inside_ of the unittest blocks, 
as I've been adjusting the unit tests, I've been changing them to

version(testStdDateTime) unittest

because Andrei didn't like the extra vertical space used up by having separate 
blocks for the unittest and for the version. So, for instance, if the utility 
assumed that unittest was the first part of the line for a unittest block, it 
wouldn't work on std.datetime (IIRC, std.algorithm would have similar 
problems).
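
As a rough illustration of what "smarter" could mean here, a line-based tool
could skip any leading version(...) conditions before looking for the
unittest keyword. The function name startsUnittest is invented for this
sketch, and it only handles the forms mentioned above, not attributes or
other prefixes.

```d
import std.ascii : isAlphaNum;
import std.string : indexOf, stripLeft;

/// True if line looks like the header of a unittest block, with or without
/// leading version(...) conditions.
bool startsUnittest(string line)
{
    line = line.stripLeft();
    while (line.length >= 7 && line[0 .. 7] == "version")
    {
        auto rest = line[7 .. $].stripLeft();
        if (rest.length == 0 || rest[0] != '(')
            return false;               // e.g. "version = testStdDateTime;"
        immutable close = rest.indexOf(')');
        if (close < 0)
            return false;
        line = rest[close + 1 .. $].stripLeft();
    }
    if (line.length < 8 || line[0 .. 8] != "unittest")
        return false;
    // Reject identifiers that merely begin with "unittest".
    return line.length == 8 || !(isAlphaNum(line[8]) || line[8] == '_');
}

void main()
{
    assert(startsUnittest("unittest"));
    assert(startsUnittest("    version(testStdDateTime) unittest"));
    assert(!startsUnittest("version = testStdDateTime;"));
    assert(!startsUnittest("unittestHelper();"));
    assert(!startsUnittest("void f() {}"));
}
```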

- Jonathan M Davis

Mar 20 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Sun, 20 Mar 2011 07:50:10 -0000, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 Hmmm. I really don't know what could be done to fix that (other than making it
 easier to rip out the unittest blocks). And enough of std.datetime depends on
 other parts of std.datetime that trimming it down isn't (and can't be) exactly
 easy. In general, SysTime is the most likely type to be used, and it depends
 on Date, TimeOfDay, and DateTime, and all 4 of those depend on most of the
 free functions in the module. It's not exactly designed in a manner which
 allows you to cut out large chunks and still have it compile. And I don't
 think that it _could_ be designed that way and still have the functionality
 that it has.

 I guess that this sort of problem is one that would pop up mainly when dealing
 with compiler bugs. I have a hard time seeing it popping up with your typical
 bug in Phobos itself. So, I guess that this is the sort of thing that you'd
 run into and I likely wouldn't.

 I really don't know how the situation could be improved though other than
 making it easier to cut out the unit tests.

I was just thinking .. if we get a list of the symbols the linker is  
including, then write an app to take that list, and strip everything else  
out of the source .. would that work?  The Q's are how hard is it to get  
the symbols from the linker and then how hard is it to match those to  
source.  IIRC there are functions in phobos to convert to/from symbol  
names, so if the app had sufficient lexing and parsing capability it could  
match on those.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
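
For the symbol-name half of this idea, druntime's core.demangle module can turn a mangled linker symbol back into a readable D name. The following is only a minimal sketch: the function name usedByOtherModules is a hypothetical placeholder, the mangled name is taken from the compiler via .mangleof rather than from a real linker map, and matching the demangled names back to source text (the hard part) is not attempted.

```d
import core.demangle : demangle;
import std.stdio : writeln;

// Hypothetical stand-in for a symbol that the linker reports as being used.
void usedByOtherModules() {}

void main()
{
    // .mangleof yields the name as it would appear in the object file.
    enum mangled = usedByOtherModules.mangleof;

    // demangle() converts it back into a human-readable signature.
    writeln(mangled, " -> ", demangle(mangled));
}
```

A real tool would read the symbol list from the linker's map file instead, which is an assumption about the toolchain that this sketch does not cover.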

Mar 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 I was just thinking .. if we get a list of the symbols the linker is
 including, then write an app to take that list, and strip everything else
 out of the source .. would that work.  The Q's are how hard is it to get
 the symbols from the linker and then how hard is it to match those to
 source.  IIRC there are functions in phobos to convert to/from symbol
 names, so if the app had sufficient lexing and parsing capability it could
 match on those.

That would require a full-blown D lexer and parser.

- Jonathan M Davis

Mar 23 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 23 Mar 2011 15:16:46 -0000, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:
 I was just thinking .. if we get a list of the symbols the linker is
 including, then write an app to take that list, and strip everything else
 out of the source .. would that work.  The Q's are how hard is it to get
 the symbols from the linker and then how hard is it to match those to
 source.  IIRC there are functions in phobos to convert to/from symbol
 names, so if the app had sufficient lexing and parsing capability it could
 match on those.

 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

Yeah, I thought as much.  I wonder if the new guy "Ilya" who just posted  
on digitalmars.D would find this interesting..

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 23 2011

Kai Meyer <kai unixlords.com> writes:

On 03/23/2011 09:16 AM, Jonathan M Davis wrote:
 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

Why are we talking about having to recreate a full-blown lexer and 
parser when there has to be one that exists for D anyway? This is 
sounding more and more like you're asking the wrong crowd to solve a 
problem. To do it right, the people who have access to the real D lexer 
and parser would need to write this utility, and in some ways, it's 
already written since compiling without a -unittest flag already omits 
all the unittests.

So I'm a bit confused about two things.

1) Why ask the wrong people to write the tool in the first place?
2) Why are we the wrong people anyway?

-Kai Meyer

Mar 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 Why are we talking about having to recreate a full-blown lexer and
 parser when there has to be one that exists for D anyway? This is
 sounding more and more like you're asking the wrong crowd to solve a
 problem. To do it right, the people who have access to the real D lexer
 and parser would need to write this utility, and in some ways, it's
 already written since compiling with out a -unittest flag already omits
 all the unittests.
 
 So I'm a bit confused about two things.
 
 1) Why ask the wrong people to write the tool in the first place?
 2) Why are we the wrong people any way?

There are tasks for which you need to be able to lex and parse D code. To 100% 
correctly remove unit tests would be one such task. Another would be if you 
want a program to be able to syntax highlight some D code. Currently, as far 
as I know, there are only two lexers and two parsers for D: the C++ front end 
which dmd, gdc, and ldc use and the D front end which ddmd uses and which is 
based on the C++ front end. Both of those are under the GPL (which makes them 
useless for a lot of stuff) and both of them are tied to compilers. Being able 
to lex D code and get the list of tokens in a D program and being able to 
parse D code and get the resultant abstract syntax tree would be very useful 
for a number of programs.

So, while your average program may not care about being able to lex and parse 
D code, there _are_ programs that do, and being able to do so in D would be 
highly valuable for such programs. Previously Walter asked for a volunteer to 
port the lexer from the C++ front end to D under the Boost license to be put 
into Phobos (I volunteered for that and have been working on it off and on, 
slowly making progress on it). Andrei's reaction was that we should have a 
generic lexer which uses generic programming and is not tied to D at all, and 
_that_ is what someone may be working on for the GSoC (there are still solid 
arguments for having a D-specific lexer though, so hopefully we end up with 
both).

Now, for this particular problem, in order to track down certain types of 
compiler bugs, he needs to be able to build with -unittest but not have 
irrelevant code compiled in. So, for instance, if he's testing a bug related 
to compiling std.file with -unittest and it imported std.datetime, he would 
want to strip out as much of std.datetime as std.file doesn't need in order to 
minimize the code that he has to deal with to find the bug. std.datetime's 
unit tests are a prime example of code that would be unnecessary. So, he wants a 
tool to strip the unit tests from a file. You can't use the compiler's lexer 
or parser to do that without a lot of changes. To do it 100% correctly, he 
needs a lexer (and possibly a parser) which can be used by a utility other 
than the compiler to read in a source file, strip out the unit tests, and then 
write out the file again. However, he's willing to settle for a utility that 
_mostly_ works, and you can do that without a full-blown D lexer or parser.

- Jonathan M Davis

Mar 23 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:
 There are tasks for which you need to be able to lex and parse D code.  
 To 100% correctly remove unit tests would be one such task.

Is that last bit true?  You definitely need to be able to lex it, but  
instead of actually parsing it you just count { and } and remove  
'unittest' plus { plus } plus everything in between right?

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 25 2011

spir <denis.spir gmail.com> writes:

On 03/25/2011 12:08 PM, Regan Heath wrote:
 On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg gmx.com>
wrote:
 There are tasks for which you need to be able to lex and parse D code. To
 100% correctly remove unit tests would be one such task.

 Is that last bit true? You definitely need to be able to lex it, but instead of
 actually parsing it you just count { and } and remove 'unittest' plus { plus }
 plus everything in between right?

At first sight, you're both wrong: you'd need to count { } levels. Also, I 
think true lexing is not really needed: you'd only need to put apart strings 
and comments that could hold non-code { & }.
(But these are only very superficial notes.)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 25 2011

Don <nospam nospam.com> writes:

spir wrote:
 On 03/25/2011 12:08 PM, Regan Heath wrote:
 On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 There are tasks for which you need to be able to lex and parse D code. To
 100% correctly remove unit tests would be one such task.

 Is that last bit true? You definitely need to be able to lex it, but instead of
 actually parsing it you just count { and } and remove 'unittest' plus { plus }
 plus everything in between right?

 At first sight, you're both wrong: you'd need to count { } levels. Also,
 I think true lexing is not really needed: you'd only need to put apart
 strings and comments that could hold non-code { & }.
 (But these are only very superficial notes.)
 
 Denis

Yes, exactly: you just need to lex strings (including q{}), comments 
(which you remove),
unittest, and count levels of {.
You need to worry about backslashes in comments, but that's about it.

I even did this in a CTFE function once, I know it isn't complicated.
Should be possible in < 50 lines of code.
I just didn't want to have to do it myself.

In fact, it would be adequate to replace:
unittest
{
    blah...
}
with:
unittest{}

Then you don't need to worry about special cases like:

version(XXX)
unittest
{
...
}
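
The approach described above (skip strings and comments, spot the unittest keyword, count brace levels, leave `unittest{}` behind) could be sketched roughly as follows. This is not the CTFE function mentioned in the post, and it runs longer than 50 lines; the names isIdentChar, skipToken, skipBlock and stripUnittests are all made up for illustration. It assumes only whitespace sits between `unittest` and its opening brace, and it does not handle delimited strings (q"..."), hex strings, or tests generated through string mixins.

```d
import std.algorithm.comparison : min;
import std.algorithm.searching : startsWith;
import std.array : appender;
import std.ascii : isAlphaNum, isWhite;
import std.stdio : write;

bool isIdentChar(char c) { return isAlphaNum(c) || c == '_'; }

/// If a comment or string literal starts at s[i], return the index just
/// past it; otherwise return i unchanged.
size_t skipToken(string s, size_t i) {
    auto rest = s[i .. $];
    if (rest.startsWith("//")) {
        while (i < s.length && s[i] != '\n') ++i;   // stop before the newline
        return i;
    }
    if (rest.startsWith("/*")) {
        i += 2;
        while (i < s.length && !s[i .. $].startsWith("*/")) ++i;
        return min(i + 2, s.length);
    }
    if (rest.startsWith("/+")) {                    // nesting comment
        int depth = 0;
        while (i < s.length) {
            if (s[i .. $].startsWith("/+")) { ++depth; i += 2; }
            else if (s[i .. $].startsWith("+/")) { i += 2; if (--depth == 0) break; }
            else ++i;
        }
        return i;
    }
    if (rest.startsWith("q{")) return skipBlock(s, i + 1);  // token string
    if (rest.startsWith("r\"") || s[i] == '`') {    // WYSIWYG string, no escapes
        immutable char quote = s[i] == 'r' ? '"' : '`';
        i += (s[i] == 'r') ? 2 : 1;
        while (i < s.length && s[i] != quote) ++i;
        return min(i + 1, s.length);
    }
    if (s[i] == '"' || s[i] == '\'') {              // backslash escapes allowed
        immutable char quote = s[i++];
        while (i < s.length && s[i] != quote) i += (s[i] == '\\') ? 2 : 1;
        return min(i + 1, s.length);
    }
    return i;
}

/// s[i] must be '{'. Returns the index just past the matching '}'.
size_t skipBlock(string s, size_t i) {
    int depth = 0;
    while (i < s.length) {
        immutable j = skipToken(s, i);
        if (j != i) { i = j; continue; }
        // Step over whole words so that "Seq{" is not mistaken for q{.
        if (isIdentChar(s[i])) { while (i < s.length && isIdentChar(s[i])) ++i; continue; }
        if (s[i] == '{') ++depth;
        else if (s[i] == '}' && --depth == 0) return i + 1;
        ++i;
    }
    return i;
}

/// Removes comments and replaces every unittest body with an empty one.
string stripUnittests(string s) {
    auto result = appender!string();
    size_t i = 0;
    while (i < s.length) {
        immutable j = skipToken(s, i);
        if (j != i) {
            if (s[i] != '/') result.put(s[i .. j]); // keep strings, drop comments
            i = j;
            continue;
        }
        if (!isIdentChar(s[i])) { result.put(s[i++]); continue; }
        immutable start = i;
        while (i < s.length && isIdentChar(s[i])) ++i;
        result.put(s[start .. i]);
        if (s[start .. i] != "unittest") continue;
        auto b = i;                                 // look ahead for the opening brace
        while (b < s.length && isWhite(s[b])) ++b;
        if (b < s.length && s[b] == '{') {
            result.put("{}");
            i = skipBlock(s, b);
        }
    }
    return result.data;
}

void main() {
    write(stripUnittests("int f() { return 1; } // doc\nunittest\n{\n    assert(f() == 1); // }\n}\n"));
}
```

Leaving `unittest{}` in place, rather than deleting the keyword, means a preceding `version(XXX)` still has a declaration to attach to, which is the special case the post mentions.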

Mar 25 2011

"Nick Sabalausky" <a a.a> writes:

"Regan Heath" <regan netmail.co.nz> wrote in message 
news:op.vswbv8qj54xghj puck.auriga.bhead.co.uk...
 On Wed, 23 Mar 2011 21:16:02 -0000, Jonathan M Davis <jmdavisProg gmx.com> 
 wrote:
 There are tasks for which you need to be able to lex and parse D code. 
 To 100% correctly remove unit tests would be one such task.

 Is that last bit true?  You definitely need to be able to lex it, but 
 instead of actually parsing it you just count { and } and remove 
 'unittest' plus { plus } plus everything in between right?

No, to do it 100% reliably, you do need lexing/parsing, and also the 
semantics stage. Example:

string makeATest(string str)
{
    return "unit"~"test { "~str~" }";
}
mixin(makeATest(q{
    // Do tests
}));

Mar 25 2011

Alexey Prokhin <alexey.prokhin yandex.ru> writes:

 Currently, as far as I know, there are only two lexers and two parsers for
 D: the C++ front end which dmd, gdc, and ldc use and the D front end which
 ddmd uses and which is based on the C++ front end. Both of those are under
 the GPL (which makes them useless for a lot of stuff) and both of them are
 tied to compilers. Being able to lex D code and get the list of tokens in
 a D program and being able to parse D code and get the resultant abstract
 syntax tree would be very useful for a number of programs.

There is a third one: http://code.google.com/p/dil/. The main page says that 
the lexer and the parser are fully implemented for both D1 and D2. But the 
license is also the GPL.

Mar 24 2011

"Nick Sabalausky" <a a.a> writes:

"Alexey Prokhin" <alexey.prokhin yandex.ru> wrote in message 
news:mailman.2713.1300954193.4748.digitalmars-d-learn puremagic.com...
 Currently, as far as I know, there are only two lexers and two parsers for
 D: the C++ front end which dmd, gdc, and ldc use and the D front end which
 ddmd uses and which is based on the C++ front end. Both of those are under
 the GPL (which makes them useless for a lot of stuff) and both of them are
 tied to compilers. Being able to lex D code and get the list of tokens in
 a D program and being able to parse D code and get the resultant abstract
 syntax tree would be very useful for a number of programs.

 There is a third one: http://code.google.com/p/dil/. The main page says that
 the lexer and the parser are fully implemented for both D1 and D2. But the
 license is also the GPL.

The nearly-done v0.4 of my Goldie parsing system (zlib/libpng license) comes 
with a mostly-complete lexing-only grammar for D2.

http://www.dsource.org/projects/goldie/browser/trunk/lang/dlex.grm

The limitations of it right now:

- Doesn't do nested comments. That requires a feature (that's going to be 
introduced in the related tool GOLD Parsing System v4.2) that I haven't had 
a chance to add into Goldie just yet.

- It's possible there might be some edge-case bugs regarding either the ".." 
operator and/or float literals.

- It's ASCII-only. Goldie supports Unicode, but character set optimization 
isn't implemented yet, so unicode grammars are technically possible but 
impractical ATM (this will be the top priority after I get v0.4 released).

Mar 25 2011

"Nick Sabalausky" <a a.a> writes:

"Nick Sabalausky" <a a.a> wrote in message 
news:imivp7$2fu$1 digitalmars.com...
 "Alexey Prokhin" <alexey.prokhin yandex.ru> wrote in message 
 news:mailman.2713.1300954193.4748.digitalmars-d-learn puremagic.com...
 Currently, as far as I know, there are only two lexers and two parsers for
 D: the C++ front end which dmd, gdc, and ldc use and the D front end which
 ddmd uses and which is based on the C++ front end. Both of those are under
 the GPL (which makes them useless for a lot of stuff) and both of them are
 tied to compilers. Being able to lex D code and get the list of tokens in
 a D program and being able to parse D code and get the resultant abstract
 syntax tree would be very useful for a number of programs.

 There is a third one: http://code.google.com/p/dil/. The main page says that
 the lexer and the parser are fully implemented for both D1 and D2. But the
 license is also the GPL.

 The nearly-done v0.4 of my Goldie parsing system (zlib/libpng license)
 comes with a mostly-complete lexing-only grammar for D2.

 http://www.dsource.org/projects/goldie/browser/trunk/lang/dlex.grm

 The limitations of it right now:

 - Doesn't do nested comments. That requires a feature (that's going to be 
 introduced in the related tool GOLD Parsing System v4.2) that I haven't 
 had a chance to add into Goldie just yet.

Note that this probably isn't as big of a problem as it sounds:

For one thing, it still recognizes "/+" and "+/" as tokens. It'll just try 
to lex everything in between too. And when Goldie is used to just lex, you 
still get the entire source lexed even if it has errors, and the lex-error 
tokens get included in the resulting token array. So it would be pretty easy 
to just call Goldie's lex function, and then step through the token array 
removing balanced /+ and +/ sections manually.
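
The manual pass over the token array could look something like the sketch below. It does not use Goldie's actual API: tokens are modelled as plain strings, and the function name dropNestedComments is hypothetical. The only idea it illustrates is keeping a depth counter so that nested /+ ... +/ pairs are removed as balanced units.

```d
import std.stdio : writeln;

/// Drops every token from an opening "/+" through its matching "+/",
/// honouring nesting. `tokens` stands in for a lexer's token array.
string[] dropNestedComments(string[] tokens)
{
    string[] kept;
    int depth = 0;                       // how many /+ are currently open
    foreach (tok; tokens)
    {
        if (tok == "/+")
            ++depth;
        else if (tok == "+/" && depth > 0)
            --depth;
        else if (depth == 0)
            kept ~= tok;                 // outside any comment: keep it
    }
    return kept;
}

void main()
{
    auto tokens = ["int", "x", "/+", "a", "/+", "b", "+/", "c", "+/", ";"];
    writeln(dropNestedComments(tokens));
}
```

An unmatched "+/" outside any comment is left in the output, so malformed input stays visible instead of being silently swallowed.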

Mar 25 2011

spir <denis.spir gmail.com> writes:

On 03/24/2011 08:53 AM, Alexey Prokhin wrote:
 Currently, as far as I know, there are only two lexers and two parsers for
  D: the C++ front end which dmd, gdc, and ldc use and the D front end which
  ddmd uses and which is based on the C++ front end. Both of those are under
  the GPL (which makes them useless for a lot of stuff) and both of them are
  tied to compilers. Being able to lex D code and get the list of tokens in
  a D program and being able to parse D code and get the resultant abstract
  syntax tree would be very useful for a number of programs.

I fully support this. We desperately need it, I guess: working, and maintained 
as the language evolves.
This is the whole purpose of the GSOC proposal "D tools in D": 
http://prowiki.org/wiki4d/wiki.cgi?GSOC_2011_Ideas#DtoolsinD
Semantic analysis, introduced step by step, would be a huge plus.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 24 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

Isn't DDMD written in D? I'm not sure about how finished it is though.

Mar 23 2011

"Nick Sabalausky" <a a.a> writes:

"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.2696.1300895928.4748.digitalmars-d-learn puremagic.com...
 On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

 Isn't DDMD written in D? I'm not sure about how finished it is though.

I've done a little bit of playing around with DDMD for a (still only just 
barely-started) project, and it seems to be fairly well up to the task of 
building an AST and running semantics. It is still based on a somewhat older 
version of D2, though, and my understanding is that actually building a 
real-world program with it is still impractical (though I haven't tried).

Mar 25 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-19 20:41:09 -0400, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Saturday 19 March 2011 17:11:56 Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.
 
 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.
 
 This should be a good task if you're relatively new to D but would like
 to do something really useful.

 
 Unfortunately, to do that 100% correctly, you need to actually have a working D
 lexer (and possibly parser). You might be able to get something close enough to
 work in most cases, but it doesn't take all that much to throw off a basic
 implementation of this sort of thing if you don't lex/parse it with something
 which properly understands D.

Well, I made a simple lexer for D strings, comments, identifiers and a 
few other tokens which should be up to that task. It's what I use to 
parse files and detect dependencies in D for Xcode. Unfortunately, it's 
written in Objective-C++ (but half of it is plain C)...

<https://github.com/michelf/d-for-xcode/blob/master/Sources/DXBaseLexer.h>
<https://github.com/michelf/d-for-xcode/blob/master/Sources/DXBaseLexer.mm>
<https://github.com/michelf/d-for-xcode/blob/master/Sources/DXScannerTools.h>
<https://github.com/michelf/d-for-xcode/blob/master/Sources/DXScannerTools.m>

Very short unit test:
<https://github.com/michelf/d-for-xcode/blob/master/Unit%20Tests/DXBaseLexerTest.mm>
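
For illustration only (this is not the Objective-C++ lexer linked above), here is a minimal Python sketch of why even comment stripping needs to recognize string literals. The function name `strip_comments` is made up for this example. It handles `//`, `/* */` and nested `/+ +/` comments plus ordinary quoted literals, and it deliberately ignores `r"..."`, `q"..."` and `q{...}` strings, so it is an approximation rather than a full D lexer.

```python
def strip_comments(src):
    """Remove //, /* */ and nested /+ +/ comments, leaving string literals intact."""
    out = []
    i, n = 0, len(src)
    while i < n:
        two = src[i:i + 2]
        c = src[i]
        if two == "//":
            # Line comment: skip to end of line, keeping the newline itself.
            j = src.find("\n", i)
            i = n if j == -1 else j
        elif two == "/*":
            # Block comment: does not nest.
            j = src.find("*/", i + 2)
            i = n if j == -1 else j + 2
        elif two == "/+":
            # Nesting comment: track depth until it returns to zero.
            depth, i = 1, i + 2
            while i < n and depth > 0:
                if src.startswith("/+", i):
                    depth, i = depth + 1, i + 2
                elif src.startswith("+/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
        elif c in "\"'`":
            # String or character literal: copy verbatim up to the closing quote.
            # Backslash escapes apply to "..." and '...' but not to `...`.
            j = i + 1
            while j < n and src[j] != c:
                j += 2 if (src[j] == "\\" and c != "`") else 1
            j = min(j + 1, n)
            out.append(src[i:j])
            i = j
        else:
            out.append(c)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    demo = 'int x = 1; // one\nstring s = "// not a comment"; /* two */\n'
    print(strip_comments(demo))
```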

-- 


Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 19 2011

Kai Meyer <kai unixlords.com> writes:

On 03/19/2011 06:11 PM, Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.

 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.

 This should be a good task if you're relatively new to D but would like
 to do something really useful.
 -Don

Is there a copy of the official D grammar somewhere online? I wrote a 
lexer for my Compiler class and would love to try and apply it to 
another grammar.

-Kai Meyer

Mar 20 2011

Zirneklis <zerneklis.web gmail.com> writes:

On 20/03/2011 19:55, Kai Meyer wrote:
 On 03/19/2011 06:11 PM, Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.

 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.

 This should be a good task if you're relatively new to D but would like
 to do something really useful.
 -Don

 Is there a copy of the official D grammar somewhere online? I wrote a
 lexer for my Compiler class and would love to try and apply it to
 another grammar.

 -Kai Meyer

As far as I know, the documentation /is/ the official grammar:
http://digitalmars.com/d/2.0/lex.html

Mar 20 2011

Trass3r <un known.com> writes:

 Is there a copy of the official D grammar somewhere online? I wrote a  
 lexer for my Compiler class and would love to try and apply it to  
 another grammar.

The official D grammar is spread across the specification.
But I recall that someone compiled a complete grammar for D1 some time ago.

Mar 24 2011

Ary Manzana <ary esperanto.org.ar> writes:

On 3/19/11 9:11 PM, Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.

 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.

 This should be a good task if you're relatively new to D but would like
 to do something really useful.
 -Don

Can it be done in Ruby? Or do you need it in D?

Mar 20 2011

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

On Mon, 21 Mar 2011 01:52:45 +0100, Ary Manzana <ary esperanto.org.ar>  
wrote:

 On 3/19/11 9:11 PM, Don wrote:
 Here's the task:
 Given a .d source file, strip out all of the unittest {} blocks,
 including everything inside them.
 Strip out all comments as well.
 Print out the resulting file.

 Motivation: Bug reports frequently come with very large test cases.
 Even ones which look small often import from Phobos.
 Reducing the test case is the first step in fixing the bug, and it's
 frequently ~30% of the total time required. Stripping out the unit tests
 is the most time-consuming and error-prone part of reducing the test case.

 This should be a good task if you're relatively new to D but would like
 to do something really useful.
 -Don

 Can it be done in Ruby? Or you need it in D?

Part of the idea was that someone would use it to learn D. However, the
important part is that it gets done. Doing it in D would be preferable, but
it's not a requirement.


-- 
Simen

Mar 21 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 That would require a full-blown D lexer and parser.
 
 - Jonathan M Davis

 
 Isn't DDMD written in D? I'm not sure about how finished it is though.

Yes, but the lexer and parser in ddmd are not only GPL (which would be a 
problem for some stuff but not others - for something like Don's utility, it 
wouldn't be a problem) but, more importantly, tied to the compiler code. 
It's not designed to be used by an arbitrary program. For that, you would need 
a lexer and parser which were designed with an API such that an arbitrary D 
program could use them. For instance, the lexer could produce a range of 
tokens to be processed, and a program which wants to use the lexer can then 
process that range.
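
As a rough illustration of that kind of API (a Python sketch under stated assumptions, not ddmd or any existing D lexer), the lexer below is a generator that lazily yields `(kind, text)` pairs, and a separate consumer processes that stream to do what Don's task asks for: drop comments and `unittest { ... }` blocks. The names `TOKEN_RE`, `lex` and `strip_unittests` are hypothetical, and the regular expression covers only a small subset of D's lexical grammar (no nested `/+ +/` comments, no `r"..."` or `q{...}` strings).

```python
import re

# One alternative per token kind; comments and strings must come first.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\])*"|`[^`]*`|'(?:\\.|[^'\\])*')
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<space>\s+)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)


def lex(src):
    """Lazily yield (kind, text) pairs covering every character of src."""
    for m in TOKEN_RE.finditer(src):
        yield m.lastgroup, m.group()


def strip_unittests(tokens):
    """Drop comments and every `unittest { ... }` block from a token stream."""
    depth = 0       # brace depth inside the unittest body being skipped
    pending = None  # text held back after seeing `unittest`, until we see `{`
    for kind, text in tokens:
        if depth > 0:
            # Inside a unittest body: only braces in real code change the depth;
            # braces inside strings or comments arrive as other token kinds.
            if kind == "other" and text == "{":
                depth += 1
            elif kind == "other" and text == "}":
                depth -= 1
        elif pending is not None:
            if kind == "other" and text == "{":
                pending, depth = None, 1      # it was a block: discard it
            elif kind == "space":
                pending += text
            elif kind != "comment":
                yield pending + text          # e.g. version(unittest): keep it
                pending = None
        elif kind == "ident" and text == "unittest":
            pending = text
        elif kind != "comment":
            yield text
    if pending is not None:
        yield pending


if __name__ == "__main__":
    demo = 'int f() { return 1; }\nunittest {\n  assert(f() == 1);\n}\n'
    print("".join(strip_unittests(lex(demo))))
```

Because the consumer only sees tokens, a `}` inside a string literal or comment never affects the brace count, which is the main thing a plain text search gets wrong.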

- Jonathan M Davis

Mar 23 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2700.1300915109.4748.digitalmars-d-learn puremagic.com...
 On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

 Isn't DDMD written in D? I'm not sure about how finished it is though.

 Yes, but the lexer and parser in ddmd are not only GPL (which would be a
 problem for some stuff but not others - for something like Don's utility, it
 wouldn't be a problem), and more importantly, it is tied to the compiler code.
 It's not designed to be used by an arbitrary program. For that, you would need
 a lexer and parser which were designed with an API such that an arbitrary D
 program could use them. For instance, the lexer could produce a range of
 tokens to be processed, and a program which wants to use the lexer can then
 process that range.

I don't know about the license issues, but I don't think the API is a big 
deal. I'm in the early stages of a DDMD-based project to compile D code down 
to Haxe, and all I really had to do was comment out the backend-related 
section at the end of main(), inject my AST-walking/processing functions 
into the AST classes (though, admittedly, there is 1.5 metric fuckton of 
these AST classes), and then add a little bit of code at the end of main() 
to launch my AST-traversal. The main() function could easily be converted to 
a non-main one.

The only real difficulty is the fact that the AST isn't really documented, 
except for what little exists on one particular Wiki4D page (sorry, don't 
have the link ATM).

Hmm, although, depending what you're doing with it, you may also want to 
hook DDMD's stdout/stderr output, or at least the error/warning functions.

Mar 25 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On 3/23/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 That would require a full-blown D lexer and parser.

 - Jonathan M Davis

 Isn't DDMD written in D? I'm not sure about how finished it is though.

 Yes, but the lexer and parser in ddmd are not only GPL (which would be a
 problem for some stuff but not others - for something like Don's utility, it
 wouldn't be a problem), and more importantly, it is tied to the compiler code.
 It's not designed to be used by an arbitrary program. For that, you would need
 a lexer and parser which were designed with an API such that an arbitrary D
 program could use them. For instance, the lexer could produce a range of
 tokens to be processed, and a program which wants to use the lexer can then
 process that range.

 - Jonathan M Davis

I didn't even know it was GPL. It doesn't come with a license file.

Mar 23 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

What about the Artistic License? The front end can be used under that
license. Is that less restrictive than the GPL?

Mar 23 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 What about the artistic license, the front-end can be used with that
 license. Is that less restrictive than GPL?

I don't know what the exact licensing situation is. However, as I understand 
it, the C++ front-end is under the GPL, and because ddmd is based on 
the C++ front-end, it is therefore also under the GPL. If that's not the case, I don't 
know what the licensing situation really is. And I don't know what the 
artistic license says exactly, so I don't know what its restrictions are.

- Jonathan M Davis

Mar 23 2011
