digitalmars.D.learn - [improve-it] Parsing NG archive and sorting by post-count
- Andrej Mitrovic (8/8) Mar 15 2011 I thought about making a kind of code-golf contest (stackoverflow usuall...
- bearophile (20/22) Mar 15 2011 I suggest you to add unit tests and Contracts to your CommonAA() and all...
- Andrej Mitrovic (12/36) Mar 15 2011 allSatisfy definitely doesn't work for a bunch of cases, like passing
- Andrej Mitrovic (5/13) Mar 15 2011 Correction: DMD complains about having parentheses, in fact it's an erro...
- bearophile (5/10) Mar 15 2011 Sorry, this time the uninformative text was mine :-) When I have suggest...
- Andrej Mitrovic (2/6) Mar 15 2011 Thanks, I didn't know about the bugs. .
I thought about making a kind of code-golf contest (stackoverflow usually has these contests). Only I would focus on improving each other's code.

So here's my idea of the day: parse the newsgroup archive files from http://www.digitalmars.com/NewsGroup.html, and for each .html file output another .html file which lists the topics sorted by post count.

Sure, there is NG software which does this automatically. But this is about doing it in D.

Here's my implementation: https://gist.github.com/871631

Download a few .html files and save them in their own folder. Then copy my script into a .d file in the same folder, and just run it with RDMD. It will output the files in an `output` subfolder. It works on Windows, which is all I've tested it on.

There are a few things I've noticed:

Using just a simple hash with the post count as the Key type wouldn't work. There are many topics which have the same post count, and AAs can't hold duplicate keys. So I worked around this by making a wrapper which hides all the details of storing duplicates and traversal. I've called it `CommonAA`.

I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.

I can see some ways to improve this. For one, I could have used Regex instead of indexOf. I could also have tried to avoid using a wrapper, but I haven't figured out a way to do that while allowing duplicate keys and sorting them with each key still linked to its values.

Anywho, let's see you improve my code! It's just for fun and maybe we'll learn some tricks from one another. Have fun!
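One way to avoid the wrapper, sketched here as an illustration and not taken from the gist: keep each topic and its post count together in a struct and sort an array of those. Duplicate post counts are then no problem, and the title stays linked to its count. The names `Topic` and `sortedByPostCount` are hypothetical, and the sketch assumes a current Phobos.

```d
import std.algorithm : sort, SwapStrategy;

/// Hypothetical record type: one parsed topic and its number of posts.
struct Topic
{
    string title;
    int postCount;
}

/// Returns a copy of `topics` ordered from most to fewest posts.
/// The stable strategy keeps topics with equal counts in their input order.
Topic[] sortedByPostCount(Topic[] topics)
{
    auto result = topics.dup;   // leave the caller's array untouched
    sort!((a, b) => a.postCount > b.postCount, SwapStrategy.stable)(result);
    return result;
}
```

Calling `sortedByPostCount` on the parsed topics gives the order in which the output .html would list them.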
Mar 15 2011
Andrej Mitrovic:

> I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.

http://d.puremagic.com/issues/show_bug.cgi?id=4405

> Anywho, let's see you improve my code! It's just for fun and maybe we'll learn some tricks from one another. Have fun!

I suggest you add unit tests and contracts to your CommonAA() and allSatisfy() :-)

Have you tried to replace this:

    if (key in payload)
    {
        payload[key] ~= val;
    }
    else
    {
        payload[key] = [val];
    }

With just:

    payload[key] ~= val;

I suggest replacing this:

    sortedKeys.sort;

With:

    sortedKeys.sort();

Bye,
bearophile
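A minimal sketch of the two suggestions combined: appending through a key that is not yet in the AA, plus a contract and a unit test. The helper `addTopic` and the `string[][int]` payload type are assumptions made for illustration; the gist's actual `CommonAA` may be laid out differently. The `in (...)` contract form assumes a reasonably recent compiler.

```d
/// Hypothetical helper (not from the gist): records `val` under `key`.
void addTopic(ref string[][int] payload, int key, string val)
in (val.length > 0, "topic title must not be empty")
{
    // Appending through a missing key creates the entry first,
    // so no `key in payload` check is needed.
    payload[key] ~= val;
}

unittest
{
    string[][int] payload;
    addTopic(payload, 8, "first topic");
    addTopic(payload, 8, "second topic");   // same key, stored as a second element
    addTopic(payload, 3, "third topic");
    assert(payload[8] == ["first topic", "second topic"]);
    assert(payload[3] == ["third topic"]);
    assert(payload.length == 2);
}
```

Compiling with `dmd -unittest -main` runs the `unittest` block before the (empty) `main`.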
Mar 15 2011
On 3/15/11, bearophile <bearophileHUGS lycos.com> wrote:

> Andrej Mitrovic:
>
>> I've also implemented an `allSatisfy` function which works on runtime arguments. There's a similar function in std.typetuple, but it's only useful for compile-time arguments. There's probably a similar method someplace in std.algorithm, but I was too lazy to check. I thought it would be nice to have.
>
> http://d.puremagic.com/issues/show_bug.cgi?id=4405

Cool, I was afraid I was reinventing the wheel.

> I suggest you add unit tests and contracts to your CommonAA() and allSatisfy() :-)

allSatisfy definitely doesn't work for a bunch of cases, like passing a delegate instead of a literal. And CommonAA doesn't take into account things like removing elements, etc. It's definitely a half-ass implementation. :p

> Have you tried to replace this:
>
>     if (key in payload) { payload[key] ~= val; } else { payload[key] = [val]; }
>
> With just:
>
>     payload[key] ~= val;

Good catch. Since the value type is an array I could simply append to it. But since no array existed there yet, I figured I had to assign something to the empty spot in the AA first. Oh well..

> I suggest replacing this:
>
>     sortedKeys.sort;
>
> With:
>
>     sortedKeys.sort();

Yes, I prefer it that way too. Since DMD doesn't complain about it (is sort even a property?), I missed it.

Thanks for the input.
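For the delegate case mentioned above, here is a hedged sketch of a runtime `allSatisfy` that accepts any callable, whether a function literal or a delegate. It is not the gist's implementation. It leans on `std.algorithm.all`, which exists in current Phobos; the thread does not say whether anything like it was available when these posts were written.

```d
import std.algorithm : all;

/// Returns true if `pred` holds for every element of `values`.
/// `Pred` can be a function pointer, a delegate, or any other callable.
bool allSatisfy(Pred, T)(Pred pred, T[] values)
{
    return values.all!(v => pred(v));
}
```

A delegate that captures a local variable can then be passed the same way as a literal, for example `allSatisfy(dg, [1, 2, 3])`. An empty array satisfies any predicate.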
Mar 15 2011
On 3/15/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:

>> I suggest to replace this: sortedKeys.sort; With: sortedKeys.sort();
>
> Yes, I prefer it that way too.

Correction: DMD complains about having parentheses, in fact it's an error:

    ngparser.d(28): Error: undefined identifier module ngparser.sort

So I've had to remove them. And again that's that uninformative error message which I don't like.
Mar 15 2011
Andrej Mitrovic:

> Correction: DMD complains about having parentheses, in fact it's an error:
>
>     ngparser.d(28): Error: undefined identifier module ngparser.sort
>
> So I've had to remove them. And again that's that uninformative error message which I don't like.

Sorry, this time the uninformative text was mine :-) When I suggested that you add the () after the sort, I meant that you should use the std.algorithm sort instead of the deprecated built-in one, because the built-in one is slow and has bad bugs, like this one I've found:

http://d.puremagic.com/issues/show_bug.cgi?id=2819

Bye,
bearophile
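A minimal sketch of the suggested change: once `sort` is imported from std.algorithm, calling it with parentheses resolves to the library function and not to the built-in array property. The error quoted above presumably arose because no such import was in scope, though the thread does not confirm that. The helper name `sortedCopy` is hypothetical.

```d
import std.algorithm : sort;

/// Returns an ascending copy of `keys`, sorted with std.algorithm's sort.
int[] sortedCopy(const int[] keys)
{
    auto result = keys.dup;   // mutable copy; the input stays unchanged
    sort(result);             // library sort, sorts in place
    return result;
}
```

With the import in place, `sortedKeys.sort();` on an `int[]` named `sortedKeys` should work the same way through uniform function call syntax on current compilers.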
Mar 15 2011
On 3/16/11, bearophile <bearophileHUGS lycos.com> wrote:

> I meant to suggest you to use the std.algorithm sort instead of the deprecated built-in one, because the built-in one is slow and it has bad bugs, like this one I've found: http://d.puremagic.com/issues/show_bug.cgi?id=2819

Thanks, I didn't know about the bugs.
Mar 15 2011