digitalmars.D - About the Expressiveness of D

reply "Jonas Drewsen" <jdrewsen nospam.com> writes:
Article about the expressiveness of languages with D included as 
one of the contestants.

http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I tend to agree with the first comment to the article though :)

/Jonas
Apr 02 2013
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 2 April 2013 at 07:59:17 UTC, Jonas Drewsen wrote:
 Article about the expressiveness of languages with D included 
 as one of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

 I tend to agree with the first comment to the article though :)

 /Jonas

And me with the one about Go.
Apr 02 2013
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 12:59 AM, Jonas Drewsen wrote:
 Article about the expressiveness of languages with D included as one of the
 contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

It's an interesting metric, but there are too many obvious confounding variables to assume that expressiveness has the first order effect.
Apr 02 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 04/02/2013 09:59 AM, Jonas Drewsen wrote:
 Article about the expressiveness of languages with D included as one of the
 contestants.

Personal feeling here -- there's a difference between how expressive a language can be (even, how expressive it can _easily_ be) versus how expressively programmers tend to use it.

I think my own use of D tends to be heavily biased by my background in C/C++ and my lack of training in more expressively-focused development styles. D allows me to write in those paradigms I feel comfortable with -- and so my use of it is almost certainly less expressive than it could be.

That feeling is supported by how wide D's error bars are in those plots -- that diversity may well reflect the number of styles of programming one can adopt within the language. I'm surprised that the extreme lower values for the statistic still seem high relative to other languages, but that in turn might reflect the state of development of the language, with new features being added fairly regularly to the standard library (probably larger commits).

I also have a strong feeling that LOC per commit reflects too many different factors to be really reliable as a comparison, e.g. it probably depends quite strongly on the age/maturity of a project, the rate of development, and other factors.

Reading some later posts on the same blog, the author acknowledges some of these kinds of complications:
http://redmonk.com/dberkholz/2013/03/26/what-does-expressiveness-via-loc-per-commit-measure-in-practice/
Apr 02 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
 I also have a strong feeling that LOC per commit reflects too many different
 factors to be really reliable as a comparison, e.g. it probably depends quite
 strongly on the age/maturity of a project, the rate of development, and other
 factors.

Consider also that these LOC numbers are not just lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.
Apr 02 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 4:55 PM, Jesse Phillips wrote:
 I usually find the built-in unittests to cause more skew since those are
 counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.
Apr 02 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 10:13 PM, Jonathan M Davis wrote:
 On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
 On 4/2/2013 4:55 PM, Jesse Phillips wrote:
 I usually find the built-in unittests to cause more skew since those are
 counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff.

I think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better. Andrei
Apr 02 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 8:03 PM, Jonathan M Davis wrote:
 Too many of them just test a few cases to make sure that the most
 obvious stuff works rather than making sure they test corner cases and whatnot.

Currently, the datetime unittest coverage is 95%. Some of the 0 cases suggest low hanging fruit.

Despite what I just said, datetime has one of the highest unittest coverages of any phobos module. Pretty much all of the phobos module unittest coverage testing indicates more work is needed.

Minor perf improvement: the order of the tests in yearIsLeapYear() should be reversed, especially since signed divide is a very slow operation, and it is called 20 million times by the unittests!!!
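
A rough sketch of what "reversed" could mean for a leap year check, assuming the point is to run the cheapest and most selective test first. The function name isLeapYear is made up for illustration and this is not the Phobos implementation:

```d
/// Hypothetical stand-in for the function discussed above; not the Phobos source.
bool isLeapYear(int year) pure nothrow @safe
{
    if (year % 4 != 0)        // true for three years out of four, so most calls stop here
        return false;
    if (year % 100 != 0)      // divisible by 4 but not by 100: leap year
        return true;
    return year % 400 == 0;   // century years are leap only when divisible by 400
}

unittest
{
    assert(isLeapYear(2000));   // divisible by 400
    assert(!isLeapYear(1900));  // century year, not divisible by 400
    assert(isLeapYear(2012));   // ordinary leap year
    assert(!isLeapYear(2013));  // ordinary common year
}
```

With this ordering the checks against 100 and 400 only run for years that are already divisible by 4, which is the minority of inputs.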
Apr 02 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
 (with most or all of the missing lines being due to stuff like catching
 Exception and asserting 0 in the catch block for making a function nothrow
 when you know that the code being called will never throw)

Why not just mark them as nothrow? Let the compiler statically check it.
Apr 03 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
 the catch is necessary in order to mark the function as nothrow, because
 format _could_ throw. It's just that given the arguments, you know that it
 never will.

Agreed.
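
For readers who have not seen the pattern being discussed, here is a minimal sketch under stated assumptions. The helper name twoDigits and the format string are invented for illustration; only std.format's format and the nothrow attribute are real:

```d
import std.format : format;

/// Hypothetical helper: the format string and argument type are fixed, so
/// format() is not expected to throw here, but the compiler cannot prove that.
string twoDigits(int value) nothrow
{
    try
    {
        return format("%02d", value);
    }
    catch (Exception e)
    {
        // Not expected to run. Catching Exception is what allows `nothrow`;
        // the line below is also the kind of line that stays uncovered under -cov.
        assert(0, "format() threw for a fixed, valid format string");
    }
}

unittest
{
    assert(twoDigits(7) == "07");
    assert(twoDigits(42) == "42");
}
```

The compiler still checks nothrow statically; the catch block is what satisfies that check, at the cost of one line that no test can reach.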
Apr 03 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
 I'm very much in
 favor of having 100% test coverage on every line that _can_ be tested (there
 may be rare exceptions to that, but I don't think that std.datetime has any of
 them).

I'd be shocked if running -cov for the first time *didn't* come up with issues.
Apr 03 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
 Yes. My point was that 100% should be the goal, whereas I know a number of
 developers who consider something like 70% to be sufficient - and these are
 folks who actually believe in writing unit tests. Certainly, expecting to hit
 100% with -cov on the first try isn't generally very realistic unless you're
 always extremely thorough with your tests, and even then, it's easy to miss a
 line or two on rarer branches, especially as functions become more complex.

Cov testing also has a tendency to expose dead code - not just insufficient unit tests.
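
A small illustration of the dead code case, as a sketch only. The function sign is hypothetical and not taken from Phobos:

```d
/// Hypothetical example, not from Phobos.
int sign(int x) pure nothrow @safe
{
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    return 0;   // dead: the three checks above already cover every int
}

unittest
{
    // These three cases exercise every reachable path.
    assert(sign(5) == 1);
    assert(sign(-5) == -1);
    assert(sign(0) == 0);
}
```

Assuming the file is saved as sign.d, building with something like `dmd -unittest -cov -main -run sign.d` should leave a sign.lst listing in which the final return shows zero executions. No additional test can raise that count, which is the hint that the line is dead rather than under-tested.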
Apr 03 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-03 05:03, Jonathan M Davis wrote:

 I very much doubt that you could do that unless you specifically formatted the
 code to take up as few lines as possible and didn't count the unit tests or
 documentation in that line count. Otherwise, you couldn't do anything even
 close to what std.datetime does in that few lines. Sure, some functionality
 could be stripped, but you'd end up with something that did a lot less if it
 were that small. The unit tests and documentation do make it seem like a lot
 more code than it is, since they take up well over half the file (probably
 3/4), but you'd definitely lose functionality with that few lines of code, and
 you'd end up with something very poor IMHO if those 2000 lines included the
 documentation and unit tests. You'd either end up with something that was very
 bare-bones and/or something which was poorly tested, and given how easy it is
 to screw up some of those date/time calculations, having only a few tests
 would be a very bad idea.

Since he wrote "2000 lines for all functionality", I don't think he included unit tests or docs/comments.
 std.datetime's unit tests do need some refactoring (some of which I've done,
 but there's still a fair bit of work to do there), which will definitely reduce
 the number of LOC that they take up, but I don't agree at all with considering
 the unit tests as part of the LOC of file when discussing keeping LOC to a
 minimum. And while it's good to avoid repetitive unit tests, I'd much rather
 have repetitive unit tests which are thorough than short ones which aren't. I
 find your focus on trying to keep unit tests to a minimum to be disturbing and
 likely to lead to poorly tested code.

 If anything, we need to be more thorough, not less. That doesn't mean that the
 tests need to look like what std.datetime has (particularly since I
 purposefully avoided loops and other more complicated constructs when I wrote
 them originally in order to make them as simple and as far from error-prone as
 possible), but unit tests need to be thorough, and while we're getting better,
 Phobos' unit tests frequently aren't thorough enough (particularly in
 std.range and std.algorithm when it comes to testing a variety of range
 types). Too many of them just test a few cases to make sure that the most
 obvious stuff works rather than making sure they test corner cases and whatnot.

 - Jonathan M Davis

I actually prefer to have repetitive unit tests and not using loops to make it clear what they actually do. Here's an example from our code base, in Ruby:

describe "Swedish" do
  subject { build(:address) { |a| a.country_id = Country::SWEDEN } }

  it { should validate_postal_code(12345) }
  it { should validate_postal_code(85412) }

  it { should_not validate_postal_code(123) }
  it { should_not validate_postal_code(123456) }

  it { should_not validate_postal_code("05412") }
  it { should_not validate_postal_code("fooba") }
end

describe "Finnish" do
  subject { build(:address) { |a| a.country_id = Country::FINLAND } }

  it { should validate_postal_code(12345) }
  it { should validate_postal_code(12354) }
  it { should validate_postal_code(41588) }

  it { should validate_postal_code("00123") }
  it { should validate_postal_code("01588") }
  it { should validate_postal_code("00000") }

  it { should_not validate_postal_code(1234) }
  it { should_not validate_postal_code(123456) }
  it { should_not validate_postal_code("fooba") }
end

It could be written less repetitive, like this:

postal_codes = {
  Country::SWEDEN => {
    valid: [12345, 85412],
    invalid: [123, 123456, "05412", "fooba"]
  },

  Country::FINLAND => {
    valid: [12345, 12354, 41588],
    invalid: ["00123", "01588", "00000", 1234, 123456, "fooba"]
  }
}

postal_codes.each do |country_id, postal_codes|
  describe c.english_name do
    subject { build(:address) { |a| a.country_id = country_id } }

    postal_codes[:valid].each do |postal_code|
      it { should validate_postal_code(postal_code) }
    end

    postal_codes[:invalid].each do |postal_code|
      it { should_not validate_postal_code(postal_code) }
    end
  end
end

But I don't think that looks any better. I think it's much worse.

--
/Jacob Carlborg
Apr 02 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/13 2:53 AM, Jacob Carlborg wrote:
 On 2013-04-03 05:03, Jonathan M Davis wrote:
 I actually prefer to have repetitive unit tests and not using loops to
 make it clear what they actually do. Here's an example from our code
 base, in Ruby:

 describe "Swedish" do
 subject { build(:address) { |a| a.country_id = Country::SWEDEN } }

 it { should validate_postal_code(12345) }
 it { should validate_postal_code(85412) }

 it { should_not validate_postal_code(123) }
 it { should_not validate_postal_code(123456) }

 it { should_not validate_postal_code("05412") }
 it { should_not validate_postal_code("fooba") }
 end

 describe "Finnish" do
 subject { build(:address) { |a| a.country_id = Country::FINLAND } }

 it { should validate_postal_code(12345) }
 it { should validate_postal_code(12354) }
 it { should validate_postal_code(41588) }

 it { should validate_postal_code("00123") }
 it { should validate_postal_code("01588") }
 it { should validate_postal_code("00000") }

 it { should_not validate_postal_code(1234) }
 it { should_not validate_postal_code(123456) }
 it { should_not validate_postal_code("fooba") }
 end

 It could be written less repetitive, like this:

 postal_codes = {
 Country::SWEDEN => {
 valid: [12345, 85412],
 invalid: [123, 123456, "05412", "fooba"]
 },

 Country::FINLAND => {
 valid: [12345, 12354, 41588],
 invalid: ["00123", "01588", "00000", 1234, 123456, "fooba"]
 }
 }

 postal_codes.each do |country_id, postal_codes|
 describe c.english_name do
 subject { build(:address) { |a| a.country_id = country_id } }

 postal_codes[:valid].each do |postal_code|
 it { should validate_postal_code(postal_code) }
 end

 postal_codes[:invalid].each do |postal_code|
 it { should_not validate_postal_code(postal_code) }
 end
 end
 end

 But I don't think that looks any better. I think it's much worse.

The way I see it, the first is terrible and the second asks for better focus on a data-driven approach. Andrei
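
One possible shape for a data-driven version in D, offered as a sketch rather than a recommendation from anyone in the thread. The validator isValidPostalCode, the Country enum, and the rules themselves are assumptions inferred from the Ruby specs quoted above (five ASCII digits, no leading zero for Sweden). The Ruby specs pass integers as well as strings; this sketch uses strings only:

```d
enum Country { sweden, finland }

/// Hypothetical validator whose rules are inferred from the Ruby specs quoted
/// above: five ASCII digits, and no leading zero for Sweden.
bool isValidPostalCode(Country country, string code) pure nothrow @safe
{
    if (code.length != 5)
        return false;
    foreach (char ch; code)
    {
        if (ch < '0' || ch > '9')
            return false;
    }
    return country == Country.finland || code[0] != '0';
}

unittest
{
    static struct Case
    {
        Country country;
        string code;
        bool expected;
    }

    // The table is the test: adding a case means adding one row.
    static immutable Case[] cases = [
        Case(Country.sweden,  "12345",  true),
        Case(Country.sweden,  "85412",  true),
        Case(Country.sweden,  "123",    false),
        Case(Country.sweden,  "123456", false),
        Case(Country.sweden,  "05412",  false),
        Case(Country.sweden,  "fooba",  false),
        Case(Country.finland, "12345",  true),
        Case(Country.finland, "00123",  true),
        Case(Country.finland, "00000",  true),
        Case(Country.finland, "1234",   false),
        Case(Country.finland, "123456", false),
        Case(Country.finland, "fooba",  false),
    ];

    // One loop, and the failing input is reported as the assert message.
    foreach (c; cases)
        assert(isValidPostalCode(c.country, c.code) == c.expected, c.code);
}
```

Each row keeps the input and the expected result side by side, so the table stays about as readable as the one-assertion-per-line style while the checking logic is written once.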
Apr 03 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-03 19:39, Andrei Alexandrescu wrote:

 The way I see it, the first is terrible and the second asks for better
 focus on a data-driven approach.

Stupid me, posting on Ruby. -- /Jacob Carlborg
Apr 03 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/13 2:55 PM, Jacob Carlborg wrote:
 On 2013-04-03 19:39, Andrei Alexandrescu wrote:

 The way I see it, the first is terrible and the second asks for better
 focus on a data-driven approach.

Stupid me, posting on Ruby.

I was referring to the repeatability of the code used in testing, which is language-independent. Andrei
Apr 03 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-03 22:50, Andrei Alexandrescu wrote:

 I was referring to the repeatability of the code used in testing, which
 is language-independent.

I think the first one is far more readable than the one using the loop. -- /Jacob Carlborg
Apr 04 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/4/13 10:26 AM, Jacob Carlborg wrote:
 On 2013-04-03 22:50, Andrei Alexandrescu wrote:

 I was referring to the repeatability of the code used in testing, which
 is language-independent.

I think the first one is far more readable than the one using the loop.

I understand. And I think you are very wrong about that. Andrei
Apr 04 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-04-03 20:08, Jonathan M Davis wrote:

 In general, I agree, because I think that straight-forward tests that avoid
 loops and the like are far less error-prone, and you need the tests to not be
 buggy. I don't want to have to test my test code to make sure that it works
 correctly.

 However, I _do_ think that there's something to be said for refactoring the
 tests later (after the code supposedly fully works) to use loops and other
 more complicated constructs, because not only can that lead to more compact
 tests, but it also makes it much easier to make the tests more thorough
 (without taking many more lines of code). I just think that _starting out_
 with the more complicated tests is not necessarily a good idea. Treating unit
 testing code as if it were the same as normal code doesn't make sense to me,
 if nothing else, because that would indicate that you're going to have to test
 your test code, since normal code is complicated enough to require testing.
 But Andrei and I have argued about this before, and I don't expect us to agree
 ever on it.

I do refactor tests, but mostly the data. At work I think we have pretty DRY tests, mostly the data. Using factories and other functionality to keep the code simple and DRY. "validate_postal_code" is a function written specifically for the tests above to keep it DRY. -- /Jacob Carlborg
Apr 03 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
03-Apr-2013 19:55, Peter Alexander writes:
 On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:
 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time.

+1
 For comparison, min and max in std.algorithm come to nearly 200 lines on
 their own, and their unittests are hopelessly lacking. Things like
 min(uint.min, int.max) are not tested, even though there's specific code
 to handle them. To suggest that date and time handling is a mere 10x
 more complex than min/max is a bit naive in my opinion.

-- Dmitry Olshansky
Apr 03 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
 +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads. Not to pick on you, but I see this a lot here from many of our participants and finally felt compelled to speak up! And yes, I know that sometimes people complain that I do the opposite in not quoting enough of the parent.
Apr 03 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/13 11:24 PM, Steven Schveighoffer wrote:
 On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
 +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines.

I'm with Walter. The top context was fine for that message. The bottom was not, seeing as the poster had nothing to say about it. Deleting the bottom is good common courtesy. Walter himself used to leave vast amounts of trailing context in our communication, and it saved me significant time when he started to consistently trim it. With trailing chaff, essentially every reader needs to scroll down to find "is there anything more this guy wanted to add"? Some don't even insert an empty line.
 My newsreader highlights replied-to text in different colors depending
 on the level of indent. I can immediately pick out new replies, and if I
 don't want to read the re-posted stuff, I don't have to, unless I want
 to for context.

Mine too, but that doesn't make the problem go away.
 Newsreaders are known not to thread things properly, and some people's
 posts don't thread properly ANYWHERE. Context is important.

Yes, just not trailing chaff.
 Not to pick on you, but I see this a lot here from many of our
 participants and finally felt compelled to speak up!

I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;)

Such posts are good because netiquette is not as widespread and as agreed upon as grammar. Andrei
Apr 04 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/13 11:55 AM, Peter Alexander wrote:
 On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:
 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time.

I may well be. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.
 For comparison, min and max in std.algorithm come to nearly 200 lines on
 their own, and their unittests are hopelessly lacking. Things like
 min(uint.min, int.max) are not tested, even though there's specific code
 to handle them. To suggest that date and time handling is a mere 10x
 more complex than min/max is a bit naive in my opinion.

To put things in perspective, std.datetime has 34K lines, whereas std.algorithm has under 12K lines. The entire std/ has 191K lines. I'd be hard pressed to assess that that high proportion is justified. Say we set out to fit std.datetime in e.g. 20K lines without loss in functionality or testing, which I'd find more reasonable. I think the result would force overall better engineering of the entire thing (and in particular better use of data structures) - constraints may be liberating. Andrei
Apr 03 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/3/13 1:59 PM, Brad Anderson wrote:
 On Wednesday, 3 April 2013 at 17:08:57 UTC, Andrei Alexandrescu wrote:
 On 4/3/13 11:55 AM, Peter Alexander wrote:
 On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu wrote:
 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time.

I may well be. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.

Boost datetime is 27k. Just the headers comes to 17k. A 2k budget for a date time library is unreasonable unless you don't want anyone using D for anything serious involving dates and times. They are complex and require a lot of code to get right. Perhaps 34k is too large but 2k is laughable.

Agreed. I just pulled that number randomly without having looked at the current line count. Andrei
Apr 03 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-04-03 21:28, Simen Kjaeraas wrote:

 Removed all comments, unittests, and empty lines from std.datetime. File
 went from 34070 to 5843 lines.

Heheh, that's more reasonable. That's also why I don't like to have unit tests inline. -- /Jacob Carlborg
Apr 04 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-04-04 03:47, Jesse Phillips wrote:

 cloc doesn't support /+ comments... But using your number, cloc, and
 some math

std.datetime contains mostly /+ and // comments. It only contains a single /* comment. -- /Jacob Carlborg
Apr 04 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 11:03 PM, Jonathan M Davis wrote:
 I
 find your focus on trying to keep unit tests to a minimum to be disturbing and
 likely to lead to poorly tested code.

Well that's quite the assumption. Andrei
Apr 03 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 10:58 AM, Jonathan M Davis wrote:
 If you push for the lines of unit testing code to be kept to a minimum, I
 don't see how you can possibly expect stuff to be thoroughly tested.

My idea of perfection would be 100% coverage with zero redundancy in the unittests. In my experience with testing, the "quantity has a quality all its own" style of testing does not produce adequate test coverage - it simply takes a lot of time to run (which makes it less useful, as one then tends to avoid running the tests).
Apr 03 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/3/2013 11:56 AM, Jonathan M Davis wrote:
 Certainly, I agree that having the minimal tests required to test everything
 that needs testing should be the goal, but figuring out which tests are and
 aren't really needed is a bit of art.

That's why we are engineers, and not mere code monkeys.
 Actually, I'd argue that in a perfect world, you'd test absolutely every
 possible input to make sure that it had the correct output, but that's
 obviously impossible in all but the most simplistic code,

We can exploit mathematics to reduce the test cases while testing thoroughly. In physics I learned to test one's solution with the boundary cases and a couple of known cases. Mathematically, that was sufficient.
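
As a sketch of what "boundary cases and a couple of known cases" might look like for date code: the function daysInMonth below is hypothetical and written only for this illustration, not taken from std.datetime.

```d
/// Hypothetical function under test; not taken from Phobos.
int daysInMonth(int year, int month) pure nothrow @safe
{
    assert(month >= 1 && month <= 12);
    immutable bool leap = year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    switch (month)
    {
        case 2:
            return leap ? 29 : 28;
        case 4, 6, 9, 11:
            return 30;
        default:
            return 31;
    }
}

unittest
{
    // Boundaries of the month range.
    assert(daysInMonth(2013, 1) == 31);
    assert(daysInMonth(2013, 12) == 31);

    // Boundaries of the leap year rule: each clause is crossed once.
    assert(daysInMonth(2013, 2) == 28);   // not divisible by 4
    assert(daysInMonth(2012, 2) == 29);   // divisible by 4 only
    assert(daysInMonth(1900, 2) == 28);   // divisible by 100, not by 400
    assert(daysInMonth(2000, 2) == 29);   // divisible by 400

    // A couple of known values away from any boundary.
    assert(daysInMonth(2013, 4) == 30);
    assert(daysInMonth(2013, 7) == 31);
}
```

Eight assertions cover every branch and every clause of the rule, which is the kind of reduction the argument above is pointing at. It does not replace judgment about which boundaries matter.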
Apr 03 2013
prev sibling parent Chad Joan <chadjoan gmail.com> writes:
On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:
 I think it leads to writing less repetitive unittests.

 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.


 Andrei

My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user perspective. The docs for that thing can easily make someone's eyes glaze over. If you split it up, then the LOC per module would become smaller too, as a side effect.
Apr 05 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-04-03 04:13, Jonathan M Davis wrote:

 Yes, though I've had complaints before about a pull being too much code where
 the unit tests were considered part of the code, and the reviewer thought that
 number of lines was too great to be worth adding, even if the number of lines
 of normal code was relatively small. And that sort of attitude would just lead
 to not properly unit testing stuff. And while we do some great unit testing
 (the built in unit test feature is a _huge_ success in that regard), there are
 at least some areas where we really need to step up our game on that (with
 ranges in particular given all of the variations of them there are and how
 many static if branches many range-based functions have).

The problem is having the unit tests in the same file. Yes, I know, most of you love it, I don't. -- /Jacob Carlborg
Apr 02 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-04-03 08:45, Andrej Mitrovic wrote:

 One thing I noticed is that having unittests in separate files can
 catch issues with template mixins.

 If you have any private or protected functions that are used by a
 mixin template, the mixin template will not compile once the user
 tries to use it in his own code.

 There are workarounds, of course, like putting functions inside of the
 template. But the point still stands that you need to also test the
 library externally.

I didn't think about that.
 Another thing local unittests don't test are symbol clashes. If a user
 imports lib.a and lib.b from your library, he probably doesn't expect
 to get symbol clashes.

Most likely not, but there's nothing wrong with it. We do have modules for a reason. It's fairly easy to solve for the user if the issue comes up. If there are some common names that always clash, then there are some problems.
 In fact Phobos has had symbol clashes before, and we're working on
 getting rid of them (e.g. through deprecation stages). But if Phobos
 also had external test-cases then we could have avoided symbol clashes
 to begin with.

I don't know if that's something unit tests should explicitly test for. -- /Jacob Carlborg
Apr 03 2013
prev sibling parent Chad Joan <chadjoan gmail.com> writes:
On 04/02/2013 08:01 PM, Walter Bright wrote:
 On 4/2/2013 4:55 PM, Jesse Phillips wrote:
 I usually find the built-in unittests to cause more skew since those
 are counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

I think this has made me a much better programmer. And it did so a long time ago. Big win!
Apr 05 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 04/02/2013 11:15 AM, Walter Bright wrote:
 It's an interesting metric, but there are too many obvious confounding
 variables to assume that expressiveness has the first order effect.

Between your response and mine, I think we have a rather good illustration of this for the English language, never mind programming ... :-)
Apr 02 2013
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jonas Drewsen:
 Article about the expressiveness of languages with D included 
 as one of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org Bye, bearophile
Apr 02 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 6:04 AM, bearophile wrote:
 Jonas Drewsen:
 Article about the expressiveness of languages with D included as one
 of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org Bye, bearophile

I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also very worth looking into. Andrei
Apr 02 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/2/2013 1:59 PM, Andrei Alexandrescu wrote:
 On 4/2/13 6:04 AM, bearophile wrote:
 Jonas Drewsen:
 Article about the expressiveness of languages with D included as one
 of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org Bye, bearophile

I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also very worth looking into.

I agree, it's terrific. But perhaps we can just submit it to reddit as is?
Apr 02 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/2/13 4:59 PM, Andrei Alexandrescu wrote:
 On 4/2/13 6:04 AM, bearophile wrote:
 Jonas Drewsen:
 Article about the expressiveness of languages with D included as one
 of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

I think D is quite expressive: http://forum.dlang.org/thread/zdhfpftodxnvbpwvklcv forum.dlang.org Bye, bearophile

I meant to comment on this - it's a terrific walkthrough. I think bearophile should convert it into a blog post/article. I think reddit would love it. The suggestions included (such as enumerate()) are also very worth looking into.

Pinging bearophile on this again - do you want to adapt this into a blog entry? It may be worth posting the link to reddit as is, but one adaptation pass for a larger audience shouldn't hurt. Let us know! Andrei
Apr 04 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/4/13 10:36 PM, Zach the Mystic wrote:
 On Friday, 5 April 2013 at 01:55:06 UTC, bearophile wrote:
 Andrei Alexandrescu:
 Pinging bearophile on this again - do you want to adapt this into a
 blog entry? It may be worth posting the link to reddit as is, but one
 adaptation pass for a larger audience shouldn't hurt.

 Let us know!

Thank you for your interest. I like to write articles, but there are significant problems related to that post:
- It's a soup of very different things;
- It suggests things like stream fusion that I think aren't yet discussed in the D community;
- I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
- I think some of its contents are half cooked and need some more of my reflection;
- I do not like to show a text two times.

Bye,
bearophile

I just wanted to say that I also liked the article and I understand why the others would want you to repost it. I think the strengths outweigh the weaknesses you mention, but I do understand not wanting to show the thing twice.

I, too, understand that, with the amendment that it's an unwarranted concern. I used to worry about that, too (e.g. not give the same talk twice) until I understood that the overlap in audiences is very small, and the people comprising the overlap understand and approve of the reasons for repeating. Andrei
Apr 05 2013
prev sibling next sibling parent "renoX" <renozyx gmail.com> writes:
On Tuesday, 2 April 2013 at 07:59:17 UTC, Jonas Drewsen wrote:
 Article about the expressiveness of languages with D included 
 as one of the contestants.

 http://redmonk.com/dberkholz/2013/03/25/programming-languages-ranked-by-expressiveness/

 I tend to agree with the first comment to the article though :)

 /Jonas

Yep, the sorting seems quite random to me, AFAIK Vala is nothing special yet it is ranked very high in this article.. renoX
Apr 02 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 04/02/2013 03:24 PM, renoX wrote:
 Yep, the sorting seems quite random to me, AFAIK Vala is nothing special yet it
 is ranked very high in this article..

To be fair, the author does say that results for what he calls "third tier" languages (like Vala) should be considered with a great deal of skepticism: http://redmonk.com/dberkholz/2013/03/26/what-does-expressiveness-via-loc-per-commit-measure-in-practice/
Apr 02 2013
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
 On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
 I also have a strong feeling that LOC per commit reflects too 
 many different
 factors to be really reliable as a comparison, e.g. it 
 probably depends quite
 strongly on the age/maturity of a project, the rate of 
 development, and other
 factors.

Consider also that these LOC numbers are not just lines of code - they also include lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.

Not to mention that idiomatic D bracing style adds more lines.
Apr 02 2013
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
 On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
 I also have a strong feeling that LOC per commit reflects too 
 many different
 factors to be really reliable as a comparison, e.g. it 
 probably depends quite
 strongly on the age/maturity of a project, the rate of 
 development, and other
 factors.

Consider also that these LOC numbers are not just lines of code - they also include lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.

I don't know what this specific report used, but comments are generally factored out of LOC and have their own count. I usually find that the built-in unittests cause more skew, since those are counted as LOC.
Apr 02 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Apr 02, 2013 at 05:01:32PM -0700, Walter Bright wrote:
 On 4/2/2013 4:55 PM, Jesse Phillips wrote:
I usually find the build in unittests to cause more skew since those
are counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

And I'm inordinately pleased with how many careless mistakes have been caught by unittests in my D code while coding, as opposed to afterwards when I'm actually using the program for something and bugs show up. T -- The slower you go, the farther you'll get.
Apr 02 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
 On 4/2/2013 4:55 PM, Jesse Phillips wrote:
 I usually find the build in unittests to cause more skew since those are
 counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff. And while we do some great unit testing (the built in unit test feature is a _huge_ success in that regard), there are at least some areas where we really need to step up our game on that (with ranges in particular given all of the variations of them there are and how many static if branches many range-based functions have). So, what we've got is great, but we can do better. - Jonathan M Davis
Apr 02 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 02, 2013 22:44:15 Andrei Alexandrescu wrote:
 On 4/2/13 10:13 PM, Jonathan M Davis wrote:
 On Tuesday, April 02, 2013 17:01:32 Walter Bright wrote:
 On 4/2/2013 4:55 PM, Jesse Phillips wrote:
 I usually find the build in unittests to cause more skew since those are
 counted as LOC.

Often, in pulls for D, the LOC of the unittests exceeds the LOC of the fix. I'm inordinately pleased with how well unittests have become embedded in our D culture.

Yes, though I've had complaints before about a pull being too much code where the unit tests were considered part of the code, and the reviewer thought that number of lines was too great to be worth adding, even if the number of lines of normal code was relatively small. And that sort of attitude would just lead to not properly unit testing stuff.

I think it leads to writing less repetitive unittests. If we did datetime all over again, I'd give a budget of 2000 lines for all functionality. I bet the solution would be better.

I very much doubt that you could do that unless you specifically formatted the code to take up as few lines as possible and didn't count the unit tests or documentation in that line count. Otherwise, you couldn't do anything even close to what std.datetime does in that few lines. Sure, some functionality could be stripped, but you'd end up with something that did a lot less if it were that small. The unit tests and documentation do make it seem like a lot more code than it is, since they take up well over half the file (probably 3/4), but you'd definitely lose functionality with that few lines of code, and you'd end up with something very poor IMHO if those 2000 lines included the documentation and unit tests. You'd either end up with something that was very bare-bones and/or something which was poorly tested, and given how easy it is to screw up some of those date/time calculations, having only a few tests would be a very bad idea. std.datetime's unit tests do need some refactoring (some of which I've done, but there's still a fair bit of work to do there), which will definitely reduce the number of LOC that they take up, but I don't agree at all with considering the unit tests as part of the LOC of the file when discussing keeping LOC to a minimum. And while it's good to avoid repetitive unit tests, I'd much rather have repetitive unit tests which are thorough than short ones which aren't. I find your focus on trying to keep unit tests to a minimum to be disturbing and likely to lead to poorly tested code. If anything, we need to be more thorough, not less. 
That doesn't mean that the tests need to look like what std.datetime has (particularly since I purposefully avoided loops and other more complicated constructs when I wrote them originally in order to make them as simple and as far from error-prone as possible), but unit tests need to be thorough, and while we're getting better, Phobos' unit tests frequently aren't thorough enough (particularly in std.range and std.algorithm when it comes to testing a variety of range types). Too many of them just test a few cases to make sure that the most obvious stuff works rather than making sure they test corner cases and whatnot. - Jonathan M Davis
Apr 02 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/3/13, Jacob Carlborg <doob me.com> wrote:
 The problem is having the unit tests in the same file. Yes, I know, most
 of you love it, I don't.

One thing I noticed is that having unittests in separate files can catch issues with template mixins. If you have any private or protected functions that are used by a mixin template, the mixin template will not compile once the user tries to use it in his own code. There are workarounds, of course, like putting functions inside of the template. But the point still stands that you need to also test the library externally. Another thing local unittests don't test is symbol clashes. If a user imports lib.a and lib.b from your library, he probably doesn't expect to get symbol clashes. In fact Phobos has had symbol clashes before, and we're working on getting rid of them (e.g. through deprecation stages). But if Phobos also had external test-cases then we could have avoided symbol clashes to begin with.
Apr 02 2013
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 4/3/13, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 If you have any private or protected functions

I meant private or package.
 One thing I noticed is that having unittests in separate files can
 catch issues with template mixins.

I wonder if there's a way to mitigate that problem with a language feature. Perhaps marking the unittest as 'extern' would make the unittest only have access to public symbols in the module. That way you never get into the situation where testing something from within a unittest seems to work only because you forgot that you're calling a private or package function.
Apr 02 2013
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei Alexandrescu 
wrote:
 If we did datetime all over again, I'd give a budget of 2000 
 lines for all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time. For comparison, min and max in std.algorithm come to nearly 200 lines on their own, and their unittests are hopelessly lacking. Things like min(uint.min, int.max) are not tested, even though there's specific code to handle them. To suggest that date and time handling is a mere 10x more complex than min/max is a bit naive in my opinion.
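
The min(uint.min, int.max) case mentioned here is the kind of mixed signed/unsigned input such tests would pin down. A minimal sketch, assuming a reasonably recent Phobos in which min and max compare mixed-sign integers by value; the chosen cases are illustrative and are not taken from the Phobos test suite:

```d
import std.algorithm : max, min;

void main()
{
    // A plain `a < b` between int and uint would first convert -1 to
    // uint.max, so these are the inputs where a naive min/max goes wrong.
    assert(min(uint.min, int.max) == 0);
    assert(min(-1, 1u) < 0);                   // the negative value must win
    assert(max(-1, 1u) == 1);
    assert(max(int.min, uint.max) == uint.max);
}
```

Four assertions already cover the extremes of both types, which suggests that thoroughness here costs only a few lines.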
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 02, 2013 20:41:23 Walter Bright wrote:
 On 4/2/2013 8:03 PM, Jonathan M Davis wrote:
 Too many of them just test a few cases to make sure that the most
 obvious stuff works rather than making sure they test corner cases and
 whatnot.

Currently, the datetime unittest coverage is 95%. Some of the 0 cases suggest low hanging fruit. Despite what I just said, datetime has one of the highest unittest coverages of any phobos module. Pretty much all of the phobos module unittest coverage testing indicates more work is needed. Minor perf improvement: the order of the tests in yearIsLeapYear() should be reversed, especially since signed divide is a very slow operation, and it is called 20 million times by the unittests!!!

Yes. That's one of the things that I need to improve. std.datetime has a lot of tests, so it needs to do a better job of ordering stuff within unittest blocks in a manner which minimizes their cost. They need to be thorough, but they should also be efficient, or the tests will end up taking too long (which is why it doesn't do a lot of testing with exceptions right now, since they slow the tests down considerably). - Jonathan M Davis
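
A minimal sketch of the reordering suggested for yearIsLeapYear(), assuming the existing code checks the rarest condition (divisible by 400) first. The function isLeapYear below is a hypothetical stand-in written for illustration, not the Phobos source:

```d
// Hypothetical stand-in for std.datetime's yearIsLeapYear.
bool isLeapYear(int year) pure nothrow @safe
{
    // Three years out of four return here after a single modulo.
    if (year % 4 != 0)
        return false;
    // Of the years divisible by 4, 24 out of 25 are not century years.
    if (year % 100 != 0)
        return true;
    // Only century years reach the third modulo.
    return year % 400 == 0;
}

void main()
{
    assert( isLeapYear(2000));
    assert(!isLeapYear(1900));
    assert( isLeapYear(2012));
    assert(!isLeapYear(2013));
}
```

With the most common outcome tested first, the average call performs close to one modulo operation instead of three, which matters when the function is called millions of times.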
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 13:37:40 Andrei Alexandrescu wrote:
 On 4/2/13 11:03 PM, Jonathan M Davis wrote:
 I
 find your focus on trying to keep unit tests to a minimum to be disturbing
 and likely to lead to poorly tested code.

Well that's quite the assumption.

If you push for the lines of unit testing code to be kept to a minimum, I don't see how you can possibly expect stuff to be thoroughly tested. There are times that better written tests take up less space, but testing isn't free, and if anything, we need more of it, not less, if we want to make sure that all of Phobos works correctly. And on multiple occasions now, you've balked at what I would consider to be properly thorough unit tests and wanted them to be reduced in size. And since that generally means testing fewer things, I think that it's pretty much a sure thing that it's generally going to lead to poorer testing and increase the risk of code being buggy. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 3 April 2013 at 17:08:57 UTC, Andrei Alexandrescu 
wrote:
 On 4/3/13 11:55 AM, Peter Alexander wrote:
 On Wednesday, 3 April 2013 at 02:44:15 UTC, Andrei 
 Alexandrescu wrote:
 If we did datetime all over again, I'd give a budget of 2000 
 lines for
 all functionality. I bet the solution would be better.

I think you are massively underestimating the complexity and subtleties of dates and time.

May as well. I recall before I approved std.datetime I looked at the implementation sizes of similar functionality in other languages; they were all rather bulky, but std.datetime was at the high end of the range.

Boost datetime is 27k. Just the headers come to 17k. A 2k budget for a date time library is unreasonable unless you don't want anyone using D for anything serious involving dates and times. They are complex and require a lot of code to get right. Perhaps 34k is too large but 2k is laughable.
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
 Perhaps 34k is too large but 2k is laughable.

I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, April 02, 2013 20:41:23 Walter Bright wrote:
 Currently, the datetime unittest coverage is 95%. Some of the 0 cases
 suggest low hanging fruit.

I should take another look at those. I thought that I had it at more like 98% (with most or all of the missing lines being due to stuff like catching Exception and asserting 0 in the catch block for making a function nothrow when you know that the code being called will never throw), but that was quite a while ago, and it sounds like it's now missing some stuff. I'm very much in favor of having 100% test coverage on every line that _can_ be tested (there may be rare exceptions to that, but I don't think that std.datetime has any of them). - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 08:53:09 Jacob Carlborg wrote:
 On 2013-04-03 05:03, Jonathan M Davis wrote:
 I very much doubt that you could do that unless you specifically formatted
 the code to take up as few lines as possible and didn't count the unit
 tests or documentation in that line count. Otherwise, you couldn't do
 anything even close to what std.datetime does in that few lines. Sure,
 some functionality could be stripped, but you'd end up with something
 that did a lot less if it were that small. The unit tests and
 documentation do make it seem like a lot more code than it is, since they
 take up well over half the file (probably 3/4), but you'd definitely lose
 functionality with that few lines of code, and you'd end up with
 something very poor IMHO if those 2000 lines included the documentation
 and unit tests. You'd either end up with something that was very
 bare-bones and/or something which was poorly tested, and given how easy
 it is to screw up some of those date/time calculations, having only a few
 tests would be a very bad idea.

Since he wrote "2000 lines for all functionality", I don't think he included unit tests or docs/comments.

That may be, but he does seem to have a habit of including the unit tests in the line count when he doesn't like how many lines of code a new piece of functionality takes up.
 std.datetime's unit tests do need some refactoring (some of which I've
 done, but there's still a fair bit of work to do there), which will
 definitely reduce the number of LOC that they take up, but I don't agree
 at all with considering the unit tests as part of the LOC of file when
 discussing keeping LOC to a minimum. And while it's good to avoid
 repetitive unit tests, I'd much rather have repetitive unit tests which
 are thorough than short ones which aren't. I find your focus on trying to
 keep unit tests to a minimum to be disturbing and likely to lead to
 poorly tested code.
 
 If anything, we need to be more thorough, not less. That doesn't mean that
 the tests need to look like what std.datetime has (particularly since I
 purposefully avoided loops and other more complicated constructs when I
 wrote them originally in order to make them as simple and as far from
 error-prone as possible), but unit tests need to be thorough, and while
 we're getting better, Phobos' unit tests frequently aren't thorough
 enough (particularly in std.range and std.algorithm when it comes to
 testing a variety of range types). Too many of them just test a few cases
 to make sure that the most obvious stuff works rather than making sure
 they test corner cases and whatnot.
 
 - Jonathan M Davis

I actually prefer to have repetitive unit tests and not use loops, to make it clear what they actually do. Here's an example from our code base, in Ruby:

    describe "Swedish" do
      subject { build(:address) { |a| a.country_id = Country::SWEDEN } }

      it { should validate_postal_code(12345) }
      it { should validate_postal_code(85412) }

      it { should_not validate_postal_code(123) }
      it { should_not validate_postal_code(123456) }
      it { should_not validate_postal_code("05412") }
      it { should_not validate_postal_code("fooba") }
    end

    describe "Finnish" do
      subject { build(:address) { |a| a.country_id = Country::FINLAND } }

      it { should validate_postal_code(12345) }
      it { should validate_postal_code(12354) }
      it { should validate_postal_code(41588) }
      it { should validate_postal_code("00123") }
      it { should validate_postal_code("01588") }
      it { should validate_postal_code("00000") }

      it { should_not validate_postal_code(1234) }
      it { should_not validate_postal_code(123456) }
      it { should_not validate_postal_code("fooba") }
    end

It could be written less repetitively, like this:

    postal_codes = {
      Country::SWEDEN => {
        valid: [12345, 85412],
        invalid: [123, 123456, "05412", "fooba"]
      },
      Country::FINLAND => {
        valid: [12345, 12354, 41588, "00123", "01588", "00000"],
        invalid: [1234, 123456, "fooba"]
      }
    }

    postal_codes.each do |country_id, codes|
      describe c.english_name do
        subject { build(:address) { |a| a.country_id = country_id } }

        codes[:valid].each do |postal_code|
          it { should validate_postal_code(postal_code) }
        end

        codes[:invalid].each do |postal_code|
          it { should_not validate_postal_code(postal_code) }
        end
      end
    end

But I don't think that looks any better. I think it's much worse.

In general, I agree, because I think that straight-forward tests that avoid loops and the like are far less error-prone, and you need the tests to not be buggy. I don't want to have to test my test code to make sure that it works correctly. However, I _do_ think that there's something to be said for refactoring the tests later (after the code supposedly fully works) to use loops and other more complicated constructs, because not only can that lead to more compact tests, but it also makes it much easier to make the tests more thorough (without taking many more lines of code). I just think that _starting out_ with the more complicated tests is not necessarily a good idea. Treating unit testing code as if it were the same as normal code doesn't make sense to me, if nothing else, because that would indicate that you're going to have to test your test code, since normal code is complicated enough to require testing. But Andrei and I have argued about this before, and I don't expect us to agree ever on it. - Jonathan M Davis
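
The two stages described here can be sketched in D as follows. This is an illustration under stated assumptions: isValidPostalCode is a hypothetical validator (exactly five ASCII digits) invented for the example, and the cases are made up. Build with -unittest to run the blocks.

```d
// Hypothetical validator: valid means exactly five ASCII digits.
bool isValidPostalCode(string code) pure nothrow @safe
{
    if (code.length != 5)
        return false;
    foreach (char c; code)
        if (c < '0' || c > '9')
            return false;
    return true;
}

// Stage 1: one assertion per case, no control flow that could itself be wrong.
unittest
{
    assert( isValidPostalCode("12345"));
    assert( isValidPostalCode("00123"));
    assert(!isValidPostalCode("1234"));
    assert(!isValidPostalCode("123456"));
    assert(!isValidPostalCode("fooba"));
}

// Stage 2, once the code is trusted: the same cases as data, so that
// adding another case costs one array element instead of one line.
unittest
{
    foreach (code; ["12345", "00123"])
        assert(isValidPostalCode(code), code);
    foreach (code; ["1234", "123456", "fooba"])
        assert(!isValidPostalCode(code), code);
}

void main() {}
```

Passing the code as the assert message keeps the failing input visible, which is the main thing lost when moving from explicit lines to a loop.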
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 11:27:38 Walter Bright wrote:
 On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
 (with most or all of the missing lines being due to stuff like catching
 Exception and asserting 0 in the catch block for making a function nothrow
 when you know that the code being called will never throw)

Why not just mark them as nothrow? Let the compiler statically check it.

It's for cases where the compiler _can't_ check. For instance, if you had code like

    string foo(int i, int j) nothrow
    {
        try
            return format("%s: %s", i, j);
        catch(Exception e)
            assert(0, "format threw when it should have been impossible.");
    }

the catch is necessary in order to mark the function as nothrow, because format _could_ throw. It's just that given the arguments, you know that it never will. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 11:29:53 Walter Bright wrote:
 On 4/3/2013 11:08 AM, Jonathan M Davis wrote:
 I'm very much in
 favor of having 100% test coverage on every line that _can_ be tested
 (there may be rare exceptions to that, but I don't think that
 std.datetime has any of them).

I'd be shocked if running -cov for the first time *didn't* come up with issues.

Yes. My point was that 100% should be the goal, whereas I know a number of developers who consider something like 70% to be sufficient - and these are folks who actually believe in writing unit tests. Certainly, expecting to hit 100% with -cov on the first try isn't generally very realistic unless you're always extremely thorough with your tests, and even then, it's easy to miss a line or two on rarer branches, especially as functions become more complex. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 11:36:54 Walter Bright wrote:
 On 4/3/2013 10:58 AM, Jonathan M Davis wrote:
 If you push for the lines of unit testing code to be kept to a minimum, I
 don't see how you can possibly expect stuff to be thoroughly tested.

My idea of perfection would be 100% coverage with zero redundancy in the unittests. In my experience with testing, the technique of "quantity has a quality all its own" style of testing does not produce adequate test coverage - it just simply takes a lot of time to run (which makes it less useful, as one then tends to avoid running them).

Well, determining what's actually redundant isn't always easy. If a test is clearly redundant, then it makes sense to remove it, but if you're not careful with that (especially if you're basing your tests off of what the current code looks like), then it can be easy to remove tests which were basically unnecessary with the current implementation but which would have caught bugs when the code was refactored. So, while in principle, I agree that having zero redundancy would be good, in practice, I don't think that it's that straightforward. I also don't think that code coverage means much beyond the fact that if you don't have 100% (minus the lines of code that can never be hit - e.g. assert(0);), then obviously some stuff isn't tested properly. You need to hit all of the corner cases and whatnot which may not work correctly yet or which may get broken when refactoring, and often, 100% test coverage doesn't get you there, much as it's an important milestone. Certainly, I agree that having the minimal tests required to test everything that needs testing should be the goal, but figuring out which tests are and aren't really needed is a bit of art. Personally, I do tend to err on the side of over-testing rather than under-testing though, as that does a better job of ensuring that the code is correct. Actually, I'd argue that in a perfect world, you'd test absolutely every possible input to make sure that it had the correct output, but that's obviously impossible in all but the most simplistic code, and actually attempting that approach just leads to unit tests which take too long to run. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 11:58:20 Walter Bright wrote:
 On 4/3/2013 11:44 AM, Jonathan M Davis wrote:
 Yes. My point was that 100% should be the goal, whereas I know a number of
 developers who consider something like 70% to be sufficient - and these
 are
 folks who actually believe in writing unit tests. Certainly, expecting to
 hit 100% with -cov on the first try isn't generally very realistic unless
 you're always extremely thorough with your tests, and even then, it's
 easy to miss a line or two on rarer branches, especially as functions
 become more complex.

unit tests.

Good point. That's not something that I typically think of - though in a lot of cases (for me personally at least), I think that the greater risk would be functions which weren't called at all by other code but _were_ properly tested, and -cov wouldn't catch that. But finding dead code with cov is definitely something to remember. I should cov more often anyway. Too often, given how thorough I generally am with unit tests, I tend to assume that the code coverage is there - and it probably is, but it's best to be sure. - Jonathan M Davis
Apr 03 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, April 03, 2013 12:03:39 Walter Bright wrote:
 On 4/3/2013 11:56 AM, Jonathan M Davis wrote:
 Certainly, I agree that having the minimal tests required to test
 everything that needs testing should be the goal, but figuring out which
 tests are and aren't really needed is a bit of art.

That's why we are engineers, and not mere code monkeys.

True.
 Actually, I'd argue that in a perfect world, you'd test absolutely every
 possible input to make sure that it had the correct output, but that's
 obviously impossible in all but the most simplistic code,

We can exploit mathematics to reduce the test cases while testing thoroughly. In physics I learned to test one's solution with the boundary cases and a couple of known cases. Mathematically, that was sufficient.

Definitely, though in some cases, figuring out the boundary cases can be quite tricky - e.g. as thorough as std.datetime's unit tests are, I still missed some in one instance and got a bug report early on for that (though on the whole, there have been very few bugs reported on std.datetime, so I think that the unit tests have been quite effective). But getting good at figuring that sort of thing out _is_ part of our job description. - Jonathan M Davis
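
As an illustration of "boundary cases plus a couple of known cases" applied to dates, here is a minimal sketch using std.datetime's Date. The selection of cases is an assumption about what counts as a boundary and is not taken from the std.datetime test suite:

```d
import core.time : days;
import std.datetime : Date, DateTimeException, DayOfWeek;
import std.exception : assertThrown;

void main()
{
    // Boundary: end of February in a common year and in a leap year.
    assert(Date(2013, 2, 28) + days(1) == Date(2013, 3, 1));
    assert(Date(2012, 2, 28) + days(1) == Date(2012, 2, 29));

    // Boundary: century years (1900 is a common year, 2000 is a leap year).
    assert(Date(1900, 2, 28) + days(1) == Date(1900, 3, 1));
    assert(Date(2000, 2, 28) + days(1) == Date(2000, 2, 29));

    // Boundary: year rollover.
    assert(Date(2012, 12, 31) + days(1) == Date(2013, 1, 1));

    // Just past a boundary: an impossible date is rejected.
    assertThrown!DateTimeException(Date(2013, 2, 29));

    // Known case: April 3, 2013 was a Wednesday.
    assert(Date(2013, 4, 3).dayOfWeek == DayOfWeek.wed);
}
```

Seven checks cover month ends, both leap-year rules, the year rollover and one independently known answer, without enumerating every input.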
Apr 03 2013
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2013-04-03, 20:04, Jonathan M Davis wrote:

 On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
 Perhaps 34k is too large but 2k is laughable.

I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff.

Removed all comments, unittests, and empty lines from std.datetime. File went from 34070 to 5843 lines. -- Simen
Apr 03 2013
prev sibling next sibling parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Wednesday, 3 April 2013 at 19:28:56 UTC, Simen Kjaeraas wrote:
 On 2013-04-03, 20:04, Jonathan M Davis wrote:

 On Wednesday, April 03, 2013 19:59:37 Brad Anderson wrote:
 Perhaps 34k is too large but 2k is laughable.

I really should strip out the unit tests and documentation to see what the line count of actual code is, as something like 75% of that is unit tests and documentation, and IIRC, std.datetime provides most of the functionality that Boost does plus some more, though it does some weird, complicated stuff with its header files from what I recall. I'd hate to be the maintainer of Boost's datetime stuff.

Removed all comments, unittests, and empty lines from std.datetime. File went from 34070 to 5843 lines.

cloc doesn't support /+ comments... But using your number, cloc, and some math:

loc: 5843
comments: 6255
unittest: 16503
blank: 5469
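
The four counts add up to the 34070-line file size reported earlier in the thread, which a few lines of D can check. This is only a sketch over the figures posted here; the Row type and the totalLines helper are names made up for the example, and the counts themselves are approximate as noted:

```d
import std.stdio : writefln;

struct Row { string name; int lines; }

// Figures as posted in this thread.
immutable Row[] breakdown = [
    Row("loc", 5843),
    Row("comments", 6255),
    Row("unittest", 16503),
    Row("blank", 5469),
];

int totalLines(const Row[] rows) pure nothrow @safe
{
    int total = 0;
    foreach (r; rows)
        total += r.lines;
    return total;
}

void main()
{
    immutable total = totalLines(breakdown);
    assert(total == 34_070); // matches the reported size of the file

    // Print each category with its share of the whole file.
    foreach (r; breakdown)
        writefln("%-9s %6d  %5.1f%%", r.name, r.lines, 100.0 * r.lines / total);
}
```

By these numbers, code is roughly a sixth of the file, while unittests and comments together make up about two thirds of it.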
Apr 03 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
 +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines. My newsreader highlights replied-to text in different colors depending on the level of indent. I can immediately pick out new replies, and if I don't want to read the re-posted stuff, I don't have to, unless I want to for context. Newsreaders are known not to thread things properly, and some people's posts don't thread properly ANYWHERE. Context is important.
 Not to pick on you, but I see this a lot here from many of our  
 participants and finally felt compelled to speak up!

I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;) -Steve
Apr 03 2013
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Thursday, 4 April 2013 at 14:31:36 UTC, Jacob Carlborg wrote:
 On 2013-04-04 03:47, Jesse Phillips wrote:

 cloc doesn't support /+ comments... But using your number, 
 cloc, and
 some math

std.datetime contains mostly /+ and // comments. It only contains a single /* comment.

I realize that; it's the reason I had to use math. cloc reports 11598 (or something near that), so subtracting the actual loc from that gives me the /+ comments.
Apr 04 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 04 Apr 2013 09:25:30 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 4/3/13 11:24 PM, Steven Schveighoffer wrote:
 On Wed, 03 Apr 2013 14:42:12 -0400, Walter Bright
 <newshound2 digitalmars.com> wrote:

 On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
 +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

I couldn't disagree more. The given +1 had 4 lines of context. There was some straggling text after it, but this was only an additional 5 lines.

I'm with Walter. The top context was fine for that message. The bottom was not seeing as the poster had nothing to say about it. Deleting the bottom is good common courtesy. Walter himself used to leave vast amounts of trailing context in our communication, and it saved me significant time when he started to consistently trim it. With trailing chaff, essentially every reader needs to scroll down to find "is there anything more this guy wanted to add"? Some don't even insert an empty line.

Mac mail fixed this problem for me. All previously received text is folded out, no need to look at it.
 My newsreader highlights replied-to text in different colors depending
 on the level of indent. I can immediately pick out new replies, and if I
 don't want to read the re-posted stuff, I don't have to, unless I want
 to for context.

Mine too, but that doesn't make the problem go away.

It doesn't? It pretty much fixes it for me. I can see exactly what the new text is via its color.
 Newsreaders are known not to thread things properly, and some people's
 posts don't thread properly ANYWHERE. Context is important.

Yes, just not trailing chaff.

I agree, it's not necessary. But it's not worth a public scolding either.
 Not to pick on you, but I see this a lot here from many of our
 participants and finally felt compelled to speak up!

I find posts that are solely about how you didn't "post properly" annoying. Kind of like compulsively telling someone they didn't use correct grammar (for which I have to fight my instincts in order to remain married). Sorry, I had to say something ;)

Such posts are good because netiquette is not as widespread and as agreed upon as grammar.

Such posts are annoying precisely because there is no agreed upon netiquette. There is no "Right way" to post. It's actually kind of ironic that grammar is NOT policed here as much, simply because we all agree to post in English, and that's not always the author's native language. -Steve
Apr 04 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 The suggestions included (such as enumerate()) are also very 
 worth
 looking into.


I think the enumerate() was discussed mostly elsewhere.
 Pinging bearophile on this again - do you want to adapt this 
 into a blog entry? It may be worth posting the link to reddit 
 as is, but one adaptation pass for a larger audience shouldn't 
 hurt.

 Let us know!

Thank you for your interest. I like to write articles, but there are significant problems related to that post:
- It's a soup of very different things;
- It suggests things like stream fusion that I think aren't yet discussed in the D community;
- I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
- I think some of its contents are half cooked and need some more of my reflection;
- I do not like to show a text two times.

Bye,
bearophile
Apr 04 2013
prev sibling next sibling parent "Zach the Mystic" <reachzach gggggmail.com> writes:
On Friday, 5 April 2013 at 01:55:06 UTC, bearophile wrote:
 Andrei Alexandrescu:
 Pinging bearophile on this again - do you want to adapt this 
 into a blog entry? It may be worth posting the link to reddit 
 as is, but one adaptation pass for a larger audience shouldn't 
 hurt.

 Let us know!

Thank you for your interest. I like to write articles, but there are significant problems related to that post:
- It's a soup of very different things;
- It suggests things like stream fusion that I think aren't yet discussed in the D community;
- I think it's not good for consumption outside the D community, it focuses on details mostly important for the development of D/Phobos;
- I think some of its contents are half cooked and need some more of my reflection;
- I do not like to show a text two times.

Bye,
bearophile

I just wanted to say that I also liked the article and I understand why the others would want you to repost it. I think the strengths outweigh the weaknesses you mention, but I do understand not wanting to show the thing twice.
Apr 04 2013
prev sibling next sibling parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Tuesday, 2 April 2013 at 23:55:19 UTC, Jesse Phillips wrote:
 On Tuesday, 2 April 2013 at 17:33:13 UTC, Walter Bright wrote:
 On 4/2/2013 2:53 AM, Joseph Rushton Wakeling wrote:
 I also have a strong feeling that LOC per commit reflects too 
 many different
 factors to be really reliable as a comparison, e.g. it 
 probably depends quite
 strongly on the age/maturity of a project, the rate of 
 development, and other
 factors.

Consider also that these LOC numbers are not lines of code - they're also lines of comments! D's ddoc encourages writing considerably more lines of comments than C does.

While I don't know what this specific report used, comments are generally factored out of LOC and have their own count. I usually find that the built-in unittests cause more skew, since those are counted as LOC.

He certainly didn't factor out comments for all languages, meaning that he didn't do it at all.
Apr 04 2013
prev sibling next sibling parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 3 April 2013 at 18:42:14 UTC, Walter Bright wrote:
 On 4/3/2013 9:49 AM, Dmitry Olshansky wrote:
 +1

Stylistic nit: When writing a one-liner post like this, please do not quote the entire preceding post, especially if it is long. We have great forum software, and the newsreaders as well are great at navigating the threads.

+1 I hate having to scroll down just to read a one-liner that adds nearly nothing to a long post. It gives an impression of laziness on the part of the author.
Apr 04 2013
prev sibling next sibling parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Thursday, 4 April 2013 at 18:00:27 UTC, Steven Schveighoffer 
wrote:
 Mac mail fixed this problem for me.  All previously received 
 text is folded out, no need to look at it.

So there is a lot of visual noise for nothing, and you like it ? And what if one uses the web forum, like me ? Or Thunderbird ? Do we need to buy a mac and use your newsreader ? Seriously, the netiquette *demands* that you trim previous mails to keep only the necessary. If everybody was doing like you, we would end up having posts hundreds of lines long, most of which being noise.
Apr 04 2013
prev sibling next sibling parent "Kagamin" <spam here.lot> writes:
 It won’t tell you how readable the resulting code is (Hello, 
 lambda functions) or how long it takes to write it (APL 
 anyone?), so it’s not a measure of maintainability or 
 productivity.

Did I get it right, that expressiveness means trading maintainability for keystroke saving?
Apr 05 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 05 Apr 2013 02:16:02 -0400, SomeDude <lovelydear mailmetrash.com>  
wrote:

 On Thursday, 4 April 2013 at 18:00:27 UTC, Steven Schveighoffer wrote:
 Mac mail fixed this problem for me.  All previously received text is  
 folded out, no need to look at it.

So there is a lot of visual noise for nothing, and you like it ?

I like that I don't have to deal with it. I also don't have to deal with it if the person deletes the replied-to text. In other words, it takes all forms, and gives me what I need to read.
 And what if one uses the web forum, like me ? Or Thunderbird ? Do we  
 need to buy a mac and use your newsreader ?

No, I'm just stating that I don't have that problem. That is with email though, mac mail doesn't do newsgroups. It's not a solution for you, it's just that I realized I don't have to ever deal with this anymore, which I hadn't thought about.
 Seriously, the netiquette *demands* that you trim previous mails to keep  
 only the necessary.

There is no technical requirement for this. I don't think any of this would be grounds for banning here, so as long as you get your point across, I don't see a problem. There is the notion that if you make your posts annoying to read, fewer people will read them. But for this specific instance, I found 9 lines of context not to be a burden, even though 5 lines were unneeded.
 If everybody was doing like you, we would end up having posts hundreds  
 of lines long, most of which being noise.

I typically trim down my posts to the relevant information. I do it because it makes my point come across much better. -Steve
Apr 05 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 05 Apr 2013 13:13:29 -0400, Chad Joan <chadjoan gmail.com> wrote:

 My problem with datetime is that it is too monolithic.  I really wish it  
 was split into about 3 different modules.  This is frustrating from a  
 user-perspective.  The docs for that thing can easily make someone's  
 eyes gloss over.

What if the docs were split up? E.g. http://vibed.org/temp/d-programming-language.org/phobos/std/datetime.html -Steve
Apr 05 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, April 05, 2013 13:13:29 Chad Joan wrote:
 On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:
 I think it leads to writing less repetitive unittests.
 
 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.
 
 
 Andrei

My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user-perspective. The docs for that thing can easily make someone's eyes gloss over. If you split it up, then the LOC per module would become smaller too, as a side-effect.

If/Once some variant of DIPs 15 or 16 is implemented, we'll be able to transparently turn modules into packages - making the package have the same name as the old module and splitting what was in the old module across multiple modules in the new package. Code will then work exactly as before, importing the package as if it were a module but allowing you to import the modules in the package directly in new code if you want to. Then we'll be able to split up larger modules like std.algorithm or std.datetime if we want to - without breaking anyone's code. Once that's available, I have every intention of splitting up std.datetime into separate modules, but doing so before that would break code, which I don't want to do. - Jonathan M Davis
Apr 05 2013
prev sibling next sibling parent Brad Roberts <braddr puremagic.com> writes:
On 4/5/13 11:17 AM, Jonathan M Davis wrote:
 On Friday, April 05, 2013 13:13:29 Chad Joan wrote:
 On 04/02/2013 10:44 PM, Andrei Alexandrescu wrote:
 I think it leads to writing less repetitive unittests.

 If we did datetime all over again, I'd give a budget of 2000 lines for
 all functionality. I bet the solution would be better.


 Andrei

My problem with datetime is that it is too monolithic. I really wish it was split into about 3 different modules. This is frustrating from a user-perspective. The docs for that thing can easily make someone's eyes gloss over. If you split it up, then the LOC per module would become smaller too, as a side-effect.

If/Once some variant of DIPs 15 or 16 is implemented, we'll be able to transparently turn modules into packages - making the package have the same name as the old module and splitting what was in the old module across multiple modules in the new package. Code will then work exactly as before, importing the package as if it were a module but allowing you to import the modules in the package directly in new code if you want to. Then we'll be able to split up larger modules like std.algorithm or std.datetime if we want to - without breaking anyone's code. Once that's available, I have every intention of splitting up std.datetime into separate modules, but doing so before that would break code, which I don't want to do. - Jonathan M Davis

I believe it's really not a module issue at all, but a doc issue. The two are directly tied today, but I have _no_ problem with importing the module and using it as is. Yes, it's large in terms of lines in the file, but really, who's affected by that and how often. Few and seldom. Breaking it up just because of docs is like ripping a book into 10 books just because you want to only carry one chapter around.
Apr 05 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, April 05, 2013 14:36:07 Brad Roberts wrote:
 I believe it's really not a module issue at all, but a doc issue. The
 two are directly tied today, but I have _no_ problem with importing the
 module and using it as is. Yes, it's large in terms of lines in the
 file, but really, who's affected by that and how often. Few and seldom.
 Breaking it up just because of docs is like ripping a book into 10
 books just because you want to only carry one chapter around.

To some extent, I agree. I'm quite able to maintain it as one module (though to be fair to anyone arguing that it should be broken up for maintainability - as sometimes happens - it's large enough that if large portions of it get changed, you can't see the diff on github). I'm not sure that it would _hurt_ maintainability though to break it up. And I know exactly how I'd break it up if I were to break it up, and it would break up quite cleanly, I think. The main reason that it's not broken up in the first place is that I did a horrible job of breaking it up when I first introduced it, and everyone's reaction was that it should just be one module (the code has changed quite a bit since then though, so breaking it up would be much easier now). But regardless, with ddoc, breaking up the module would be the only way to break up the documentation, so we're kind of stuck in that regard (though if we start using ddox for dlang.org, that does change things). - Jonathan M Davis
Apr 05 2013