
digitalmars.D - Naming things in Phobos - std.algorithm and writefln

reply Michel Fortin <michel.fortin michelf.com> writes:
In std.algorithm, wouldn't it be clearer if "splitter" was called 
"splitLazily" or "splitLazy"? "splitter" is a noun, but as a function 
shouldn't it be a verb? "makeSplitter" or "toSplitter" perhaps?

And what about the "array" function? Wouldn't it be clearer if it was 
"toArray" so we know we're performing a conversion?

As you know, I tried to write some guidelines[1] for naming things in 
D. Those guidelines look good at first glance, but then you look at 
Phobos and you see that half of it uses some arbitrary naming rules. 
Take "writefln" for instance: following my guidelines (as they are 
currently written) it should be renamed to something like 
"writeFormattedLine".

 [1]: http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines

I don't necessarily want to change every function name, but if we want 
this document to make sense and Phobos to have some uniformity, both 
should be harmonized. Phobos should set the example and follow the 
guidelines and the guidelines should match Phobos, otherwise the 
guidelines have no weight for developers of other libraries and Phobos 
will make life harder for those who want to follow them.

For names that we don't want to bring in line with the guidelines, there 
should be some documentation explaining what rule they disobey and why.

I could take a look at std.algorithm and other modules in Phobos to 
list inconsistencies with the guidelines. From this we could make 
improvements both to the guideline document and the API. But before 
going too deep I think we should start with a few examples, such as 
those above, and see what to do with them.

What does everyone think about all this?


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 04 2009
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

This is a specious argument. splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

Following this line of reasoning, all structs should be renamed to verbs.

"makeSplitter" is OK, but needlessly verbose.

I think when you have a function whose only purpose is to construct something, or is strictly or conceptually pure, it's OK to use a noun for its name.
 And what about the "array" function? Wouldn't it be clearer if it was
 "toArray" so we know we're performing a conversion?

Same reasoning as above. toArray is also fine; it's possibly more in line with other conversion functions.
 As you know, I tried to write some guidelines[1] for naming things in D.
 Those guidelines look good at first glance, but then you look at Phobos
 and you see that half of it uses some arbitrary naming rules. Take
 "writefln" for instance: following my guidelines (as they are currently
 written) it should be renamed to something like "writeFormattedLine".
 
 [1]: http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines

I think that's a problem with the guidelines. Anyone trying to turn D into Java should be shot. It's not clear what "fln" means, true... until you look up ANY output function with a f, ln or fln suffix and then it's obvious.
 ...
 
 What does everyone think about all this?

I think it's a good idea to have a good style guide; but rules only exist to make you think once before you break them. :)
Aug 05 2009
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 03:29:11 -0400, Daniel Keep <daniel.keep.lists gmail.com> said:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

This is a specious argument. splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

I agree, those aren't very good names.
 Following this line of reasoning, all structs should be renamed to verbs.
 
 "makeSplitter" is OK, but needlessly verbose.
 
 I think when you have a function whose only purpose is to construct
 something, or is strictly or conceptually pure, it's OK to use a noun
 for its name.

Perhaps this rule should be added to the guidelines then. But as was said countless times in all the threads about properties, many nouns are also verbs in English, and they can easily create confusion in this situation.

Calling the constructor of Splitter directly would be great, but alas you can't deduce struct template arguments from its constructor and have to rely on a separate function.

"makeSplitter" is the least confusing one in my opinion.

Alternatively, we could rename "splitter" to "split". After all, the documentation says "Splits a range using another range or an element as a separator." If you want an array, you write split(...).toArray(), or perhaps better would be to have the resulting range implicitly cast to an array when needed (but are implicit casts still in the pipeline now that we have alias this?).
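
The point about constructors can be illustrated with a minimal sketch. It assumes a hypothetical Pair struct standing in for a range type such as Splitter; neither Pair nor pair comes from Phobos. The free function exists only so that the template arguments are deduced from the call:

```d
// Run with: dmd -unittest -main -run example.d

// Hypothetical stand-in for a templated range type such as Splitter.
struct Pair(A, B)
{
    A first;
    B second;
}

// Helper function: A and B are deduced from the call arguments,
// so callers never have to spell out Pair!(int, string).
Pair!(A, B) pair(A, B)(A first, B second)
{
    return Pair!(A, B)(first, second);
}

unittest
{
    auto p = pair(42, "answer"); // deduced as Pair!(int, string)
    static assert(is(typeof(p) == Pair!(int, string)));
    assert(p.first == 42 && p.second == "answer");
}
```

Whether such a helper is named with a noun ("pair") or a verb ("makePair") is exactly the naming question under discussion; the mechanism is the same either way.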
 And what about the "array" function? Wouldn't it be clearer if it was
 "toArray" so we know we're performing a conversion?

Same reasoning as above. toArray is also fine; it's possibly more in line with other conversion functions.
 As you know, I tried to write some guidelines[1] for naming things in D.
 Those guidelines look good at first glance, but then you look at Phobos
 and you see that half of it uses some arbitrary naming rules. Take
 "writefln" for instance: following my guidelines (as they are currently
 written) it should be renamed to something like "writeFormattedLine".
 
 [1]: http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines

I think that's a problem with the guidelines. Anyone trying to turn D into Java should be shot. It's not clear what "fln" means, true... until you look up ANY output function with a f, ln or fln suffix and then it's obvious.

Please, don't make unsupported accusations as an excuse to shoot people. Instead, you should say what you dislike about Java and explain why. (My guess is you find System.out.println too long.)

I'm trying to see how I can adapt the guidelines to accept this function ("writefln") and I can't see any sensible rule I could add. Any idea?

Alternatively, "writefln" could be an exception to the rules, but then the exception would need a better rationale than "it shouldn't look like Java". I mean, if Phobos makes unjustified exceptions to its naming conventions here and there for no other good reason than "it looks good", it breaks the consistency and makes function names less predictable and less readable.

But it'd be easier to rename the functions to fit the convention:

    write    -> write
    writeln  -> writeLine
    writef   -> writeFormat
    writefln -> writeLineFormat

That way, if someone one day writes logging functions that take formatted strings in the same way, he can reuse the convention:

    log
    logLine
    logFormat
    logLineFormat

instead of "log", "logln", "logf", and "logfln". If you create a hash function, you can reuse the pattern too:

    hash
    hashLine
    hashFormat
    hashLineFormat

instead of "hash", "hashln", "hashf" and "hashfln". And it goes on.
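
As a rough sketch of what the proposed logging family could look like, the four functions below follow the log / logLine / logFormat / logLineFormat pattern. Everything here is hypothetical: the function names come from the proposal, while the in-memory logBuffer sink is an assumption made only to keep the example self-contained.

```d
// Run with: dmd -unittest -main -run example.d
import std.format : format;

// Hypothetical in-memory sink, so the example needs no file or console.
string logBuffer;

// Plain text, no newline.
void log(string text) { logBuffer ~= text; }

// Plain text followed by a newline.
void logLine(string text) { log(text ~ "\n"); }

// printf-style formatting, no newline.
void logFormat(Args...)(string fmt, Args args) { log(format(fmt, args)); }

// printf-style formatting followed by a newline.
void logLineFormat(Args...)(string fmt, Args args) { logLine(format(fmt, args)); }

unittest
{
    logBuffer = null;
    logFormat("%s=%s;", "x", 1);
    logLineFormat("%s items", 3);
    assert(logBuffer == "x=1;3 items\n");
}
```

The same four bodies would work unchanged under the short names "log", "logln", "logf" and "logfln"; only the identifiers differ between the two conventions.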
 ...
 
 What does everyone think about all this?

I think it's a good idea to have a good style guide; but rules only exist to make you think once before you break them. :)

By "thinking twice" you should be able to come up with a good justification; and if you can't then you should follow the rules. That's fine by me.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
next sibling parent Sjoerd van Leent <svanleent gmail.com> writes:
Michel Fortin Wrote:

 As you know, I tried to write some guidelines[1] for naming things in D.
 Those guidelines look good at first glance, but then you look at Phobos
 and you see that half of it uses some arbitrary naming rules. Take
 "writefln" for instance: following my guidelines (as they are currently
 written) it should be renamed to something like "writeFormattedLine".
 
 [1]: http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines

I think that's a problem with the guidelines. Anyone trying to turn D into Java should be shot. It's not clear what "fln" means, true... until you look up ANY output function with a f, ln or fln suffix and then it's obvious.

Please, don't make unsupported accusations as an excuse to shoot people. Instead, you should say what you dislike about Java and explain why. (My guess is you find System.out.println too long.)

I'm trying to see how I can adapt the guidelines to accept this function ("writefln") and I can't see any sensible rule I could add. Any idea?

Alternatively, "writefln" could be an exception to the rules, but then the exception would need a better rationale than "it shouldn't look like Java". I mean, if Phobos makes unjustified exceptions to its naming conventions here and there for no other good reason than "it looks good", it breaks the consistency and makes function names less predictable and less readable.

But it'd be easier to rename the functions to fit the convention:

    write    -> write
    writeln  -> writeLine
    writef   -> writeFormat
    writefln -> writeLineFormat

That way, if someone one day writes logging functions that take formatted strings in the same way, he can reuse the convention:

    log
    logLine
    logFormat
    logLineFormat

instead of "log", "logln", "logf", and "logfln". If you create a hash function, you can reuse the pattern too:

    hash
    hashLine
    hashFormat
    hashLineFormat

instead of "hash", "hashln", "hashf" and "hashfln". And it goes on.
 ...
 
 What does everyone think about all this?

I think it's a good idea to have a good style guide; but rules only exist to make you think once before you break them. :)

By "thinking twice" you should be able to come with a good justification; and if you can't then you should follow the rules. That's fine by me.

I think the real problem underlying the wish to use writefln versus writeFormatLine (or anything like that) is that C programmers are in the habit of using very short names. But in my personal experience, most languages I use have a short formatted version of write..., probably because it is needed so often.

Although I agree that writefln, if adapted to the convention, should become writeFormatLine, I also understand the clumsiness of writing it. As far as I know, writefln has been with us for a very long time. But that doesn't say whether it should or should not be changed. I think that no one has given any thought to it.

I want to know: do we use writefln often, or is it just a convenience when writing something out? I would imagine having a formatter object that accepts a delegate which writes strings. I would thus remove writefln and just have the function write. The formatter object would be responsible for actually using it, for example:

    Formatter.out(write, "%s%s", "Hello World", newline)

The function "write" can still be used to emit directly, such as:

    write("Hello world\n")

I think this is the real problem of writefln. Not the convention, but the approach.
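
A minimal sketch of that formatter idea, under some assumptions: formatTo is a hypothetical free function (not a Phobos API), the writer is modelled as a void delegate(string), and an in-memory string stands in for the output so the result can be checked.

```d
// Run with: dmd -unittest -main -run example.d
import std.format : format;

// Hypothetical helper: formats the arguments, then hands the finished
// string to whatever writer delegate the caller supplies.
void formatTo(Args...)(void delegate(string) write, string fmt, Args args)
{
    write(format(fmt, args));
}

unittest
{
    string output;
    // Stand-in for a "write" function; appends to a local string.
    void delegate(string) write = (string s) { output ~= s; };

    formatTo(write, "%s%s", "Hello World", "\n"); // formatted path
    write("Hello world\n");                       // direct path, no formatting
    assert(output == "Hello World\nHello world\n");
}
```

This keeps formatting and output as separate concerns: any function that accepts a string can serve as the writer, so no f or fln variants are needed on the output side.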
Aug 05 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michel Fortin wrote:
 Alternatively, "writefln" could be an exception to the rules, but then 
 the exception would need a better rationale than "it shouldn't look like 
 Java". I mean, if Phobos makes unjustified exceptions to its naming 
 conventions here and there for no other good reason than "it looks 
 good", it breaks the consistency and makes function names less 
 predictable and less readable.

I agree that Phobos' names could use a good overhaul. That would make it easier to grow it too. Certain names could be kept short and intuitive although they don't fit common conventions.

Andrei
Aug 05 2009
parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Andrei Alexandrescu Wrote:

 Michel Fortin wrote:
 Alternatively, "writefln" could be an exception to the rules, but then 
 the exception would need a better rationale than "it shouldn't look like 
 Java". I mean, if Phobos makes unjustified exceptions to its naming 
 conventions here and there for no other good reason than "it looks 
 good", it breaks the consistency and makes function names less 
 predictable and less readable.

I agree that Phobos' names could use a good overhaul. That would make it easier for growing it too. Certain names could be kept short and intuitive although they don't fit common conventions. Andrei

You could also use aliases to make everyone happy; that's what I do in my local phobos source. It's just a bitch to upgrade to the newest dmd while keeping my own changes ;)

One of the most annoying names I've had in phobos was the std.utf.encode/decode functions. When I came across these in some code, the names weren't descriptive enough as to what was being done. So I rewrote the std.utf module to use names such as toUTF8, toUTF16 and toUnicode, and made a generic toUTF template to call the proper one. Then I aliased encode and decode to their corresponding toUTF calls to keep compatibility with the rest of phobos; works like a charm.

I can mail you my version of std.utf if you want, Andrei.
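
The alias approach might look roughly like the sketch below. The long names (toArray, writeLine, writeLineFormat) are the ones floated earlier in this thread and are not part of Phobos; the sketch uses the current `alias name = symbol;` syntax and only layers new names over existing functions.

```d
// Run with: dmd -unittest -main -run example.d
import std.algorithm : splitter;
import std.array : array;
import std.stdio : writeln, writefln;

// Guideline-style names layered over the existing short ones.
// Both spellings remain usable side by side.
alias toArray = array;
alias writeLine = writeln;
alias writeLineFormat = writefln;

unittest
{
    auto parts = "a,b,c".splitter(",").toArray;
    assert(parts == ["a", "b", "c"]);

    writeLine(parts);                          // same function as writeln
    writeLineFormat("%s parts", parts.length); // same function as writefln
}
```

Because an alias introduces no wrapper, there is no runtime cost; the maintenance cost mentioned above comes only from carrying such changes inside a locally patched copy of the library.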
Aug 05 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jeremie Pelletier wrote:
 Andrei Alexandrescu Wrote:
 
 Michel Fortin wrote:
 Alternatively, "writefln" could be an exception to the rules, but then 
 the exception would need a better rationale than "it shouldn't look like 
 Java". I mean, if Phobos makes unjustified exceptions to its naming 
 conventions here and there for no other good reason than "it looks 
 good", it breaks the consistency and makes function names less 
 predictable and less readable.

easier for growing it too. Certain names could be kept short and intuitive although they don't fit common conventions. Andrei

You could also use aliases to make everyone happy, thats what I do in my local phobos source, its just a bitch to upgrade to the newest dmd while keeping my own changes ;) One of the most annoying names I've had in phobos was the std.utf.encode/decode functions. When I would come across these in some code it wasnt descriptive enough as to what was being done. So I rewrote the std.utf module to use names such as toUTF8, toUTF16 and toUnicode, and made a generic toUTF template to call the proper one. Then aliased encode and decode to their corresponding toUTF calls to keep compatibility with the rest of phobos, works like a charm. I can mail you my version of std.utf if you want Andrei.

I'd be glad to look at it! Just give me time.

Andrei
Aug 05 2009
prev sibling next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Michel Fortin wrote:
 On 2009-08-05 03:29:11 -0400, Daniel Keep <daniel.keep.lists gmail.com>
 said:
 "makeSplitter" is OK, but needlessly verbose.

 I think when you have a function whose only purpose is to construct
 something, or is strictly or conceptually pure, it's OK to use a noun
 for its name.

Perhaps this rule should be added to the guideline then. But as it was said countless times in the all the threads about properties, many nouns are also verbs in English, and they can easily create confusion in this situation.

Yes, there are lots of nouns that are also verbs but "splitter" isn't one of them. Just because they exist doesn't mean you can't ever use nouns without qualification.
 Calling directly the constructor of Splitter would be great, but alas
 you can't deduce struct template arguments from its constructor and have
 to rely on a separate function.

My point was that there would be ZERO difference assuming we could; arguing that just because it's implemented using a function that it must be extra verbose is just silly.
 "makeSplitter" is the least confusing one in my opinion.

It's more accurate but, as I said, needlessly verbose. "splitter" is a noun and it's hard to even deliberately think of ways to misinterpret it. I mean, we use "sin(x)" without problems although it could also have been written "computeSineOf(x)". Except, of course, then you'd have a horde of angry numerical programmers knocking on your door with torches and pitchforks.
 Alternatively, we could rename "splitter" to "split". After all, the
 documentation says "Splits a range using another range or an element as
 a separator." If you want an array, you write split(...).toArray(), or
 perhaps better would be to have the resulting range implicitly-casted to
 an array when needed (but are implicit casts still in the pipe now that
 we have alias this?).

No, naming it "split" implies that it's performing the "split" action on its arguments, which it *isn't*. Implying that it does *anything* in the function name is misleading since nothing actually takes place until *after* the function has returned.
 ...

I think that's a problem with the guidelines. Anyone trying to turn D into Java should be shot. It's not clear what "fln" means, true... until you look up ANY output function with a f, ln or fln suffix and then it's obvious.

Please, don't make unsupported accusations as an excuse to shoot people. Instead, you should say what you dislike about Java and explain why. (My guess is you find System.out.println too long.)

I was attempting to be jovial. [1]

Java and, to a lesser extent, .NET have this serious problem where all of the names are needlessly long and verbose. This makes writing actual code tedious and annoying. No, I do not use an IDE and I shouldn't NEED autocomplete to be able to write code efficiently.

Also, making code longer means less of it fits on a line, less on a page and most critically of all: less in your brain. Look at legalese: it's incredibly accurate but also almost impossible to read. I've found that it takes so long to read a sentence that by the time you finish, you've forgotten what it was about.

Just because a name is unambiguous doesn't automatically make it good.

writefln is a good name because the root indicates what it does "it writes to something", the f comes from printf and ln is used in a few languages.

What's more, even if you don't know what those suffixes mean initially, because they're used *consistently*, once you learn it you can apply it all over the place.

And best of all: it's short to type. [2]
 I'm trying to see how I can adapt the guidelines to accept this function
 ("writefln") and I can't see any sensible rule I could add. Any idea?

"Short suffixes are good when applied consistently across multiple symbols."?
 Alternatively, "writefln" could be an exception to the rules, but then
 the exception would need a better rationale than "it shouldn't look like
 Java". I mean, if Phobos makes unjustified exceptions to its naming
 conventions here and there for no other good reason than "it looks
 good", it breaks the consistency and makes function names less
 predictable and less readable.

As I indicated above, it's descriptive, consistent and short; a great name.
 But it'd be easier to rename the functions to fit the convention:

I find that supposition hard to accept.
 ...
 
 That way, if someone writes logging functions one day that takes
 formatted strings in the same way, he can reuse the convention:
 
     log
     logLine
     logFormat
     logLineFormat
 
 instead of "log", "logln", "logf", and "logfln". If you create a hash
 function, you can reuse the pattern too:
 
     hash
     hashLine
     hashFormat
     hashLineFormat
 
 instead of "hash", "hashln", "hashf" and "hashfln". And it goes on.

How is this an improvement? If we accept that people know what the "f" and "ln" suffixes mean (and given that they will be exposed to this in the course of writing a Hello, World! program), what benefit is gained from increasing the length and complexity of the identifiers? Saying you can re-use the convention is irrelevant because the exact same thing can be said of the shorter suffixes.

---

[1] I mean I was joking about you trying to turn D into Java. People who ARE trying to turn D into Java *definitely* need to be shot. Then buried upside-down at a crossroads with a stake through their heart. Then burn and salt the earth afterwards just to be safe; you can't take your chances with that sort of malignant evil...

[2] And just to pre-empt it: this doesn't mean 'ofln' is a good name for an output function. Doesn't mean it's bad either; that depends on context.
Aug 05 2009
next sibling parent reply Benji Smith <dlanguage benjismith.net> writes:
Daniel Keep wrote:
 That way, if someone writes logging functions one day that takes
 formatted strings in the same way, he can reuse the convention:

     log
     logLine
     logFormat
     logLineFormat

 instead of "log", "logln", "logf", and "logfln". If you create a hash
 function, you can reuse the pattern too:

     hash
     hashLine
     hashFormat
     hashLineFormat

 instead of "hash", "hashln", "hashf" and "hashfln". And it goes on.

How is this an improvement? If we accept that people know what the "f" and "ln" suffixes mean (and given that they will be exposed to this in the course of writing a Hello, World! program), what benefit is gained from increasing the length and complexity of the identifiers? Saying you can re-use the convention is irrelevant because the exact same thing can be said of the shorter suffixes.

The thing about one-letter abbreviations is that they mean different things in different contexts. An "f" might mean "formatted" in a "writefln" function, but it means "file" in an "ifstream" and "floating point" in the "fenv" module.

In those cases (and in many more), there's no convention that can be reused. You just have to memorize stuff. Memorization was a perfectly acceptable solution back in the days of C, when standard libraries were small. But I think any modern standard library, with scores of modules and hundreds (or thousands) of functions, needs a better strategy.

Coming from a Java background, I much prefer to give up terseness in favor of clarity. Though I recognize that verbosity has its own pitfalls, I think it's the lesser evil.

--benji
Aug 05 2009
parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Benji Smith wrote:
 ...
 
 The thing about one-letter abbreviations is that they mean different
 things in different contexts. An "f" might mean "formatted" in a
 "writefln" function, but it means "file" in an "ifstream" and "floating
 point" in the "fenv" module.

I don't think this applies. Firstly, I was talking about suffixes, not abbreviations appearing in other parts of a name.

Secondly, the convention is: "an f suffix on an output method means formatting." That doesn't conflict with fenv at all since the f there isn't a suffix and "env" has nothing to do with output at all. Besides, even if you thought the "f" was "formatter", you'd be quickly dissuaded when you looked in the docs and saw "fenv.h is the standard header providing access to the floating point environment."

The ambiguous case is ifstream which could be interpreted as "input formatted stream" if you were really trying. But here's the kicker: "ifstream" is a bad class name ANYWAY because it's all lowercase and highly ambiguous. A sane name for ifstream would be FileInput which drops the redundant "stream", expands both the "i" and "f" to their full names and is only one character longer. Win!

Finally, I don't think you can just toss context out the window completely and say 'but there's an f; it could mean anything!'. No, interpret it based on context and all will be well.

If you have an output function on a file with an 'f' suffix, odds are it means "format". If you have a logging function on a logging object with an 'f' suffix, odds are it means "format". If you have an inverse square root function on a math object with an 'f' suffix, odds are it really doesn't mean "format". Given we have overloading, my guess would be "fast", and then I'd check the docs.
 In those cases (and in many more), there's no convention that can be
 reused. You just have to memorize stuff. Memorization was a perfectly
 acceptable solution back in the days of C, when standard libraries were
 small. But I think any modern standard library, with scores of modules
 and hundreds (or thousands) of functions, needs a better strategy.

You can't not memorise stuff. You have to look up the docs if you don't remember what a function's arguments are or what its semantics are irrespective of the name.
 Coming from a Java background, I much prefer to give up terseness in
 favor of clarity. Though I recognize that verbosity has its own
 pitfalls, I think it's the lesser evil.
 
 --benji

It's alright; enough electro-therapy will cure you of your Java tendencies. We CAN save you!
Aug 05 2009
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 11:11:20 -0400, Daniel Keep <daniel.keep.lists gmail.com> said:

 Java and, to a lesser extent, .NET have this serious problem where all
 of the names are needlessly long and verbose.  This makes writing actual
 code tedious and annoying.  No, I do not use an IDE and I shouldn't NEED
 autocomplete to be able to write code efficiently.

I do Objective-C, so I know a lot about long names. :-) What's great about them is that they make the code a lot more readable (especially with Objective-C where methods read like a sentence). What's less great is that they're long to type. (And no, I don't want to see sentence-long function names in D by the way; the D grammar isn't appropriate for that.)
 Also, making code longer means less of it fits on a line, less on a page
 and most critically of all: less in your brain. Look at legalese: it's
 incredibly accurate but also almost impossible to read.  I've found that
 it takes so long to read a sentence that by the time you finish, you've
 forgotten what it was about.

My take is that the sentences are too long, not the actual words. Would the legalese be more readable if it were using abbreviated words? I believe it'd be worse. Same with code. If you put too much in a statement it'll become hard to decipher.
 Just because a name is unambiguous doesn't automatically make it good.
 
 writefln is a good name because the root indicates what it does "it
 writes to something", the f comes from printf and ln is used in a few
 languages.

You mean that a good name is one that begins with a word followed by half-conventions from various other languages merged together? That other languages use some other convention isn't really a justification for naming D functions against the D conventions.
 What's more, even if you don't know what those suffixes mean initially,
 because they're used *consistently*, once you learn it you can apply it
 all over the place.

But can they be used consistently? In C, you have a lot of math functions having the suffix "f", and it has nothing to do with them being formatted. Can we restrict the usage of the suffix "f" to mean "formatted" in the guidelines? Can we restrict the usage of the suffix "ln" to mean "line"?

Also, that convention doesn't scale. There is a very limited number of short suffixes. Either you'll start reusing them for other things, or you'll have a bunch of legacy functions using a short suffix and new functions with more explicit names.
 And best of all: it's short to type. [2]

 [2] And just to pre-empt it: this doesn't mean 'ofln' is a good name for
 an output function.  Doesn't mean it's bad either; that depends on context.

True indeed. But D allows symbol renaming in the import declaration. If you're going to use a function a lot, you can rename it to something shorter for a module. If you're not using it much, the long name will be more explicit.

    import std.stdio : ofln = outputFormattedLine;
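
Since outputFormattedLine is a made-up name, here is a small sketch of the same renamed-import mechanism applied to functions that do exist in Phobos (std.format.format and std.stdio.writefln). The short names fmt and wfln are arbitrary choices for illustration.

```d
// Run with: dmd -unittest -main -run example.d

// Renamed selective imports: only the new names are visible in this module.
import std.format : fmt = format;
import std.stdio : wfln = writefln;

unittest
{
    assert(fmt("%s-%s", 1, 2) == "1-2");
    wfln("%s %s", "hello", "world"); // calls std.stdio.writefln
}
```

The renaming is local to the importing module, so a library could expose long, explicit names while each module picks whatever shorthand suits it.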
 I'm trying to see how I can adapt the guidelines to accept this function
 ("writefln") and I can't see any sensible rule I could add. Any idea?

"Short suffixes are good when applied consistently across multiple symbols."?

Does this justify the suffix being lowercase? It's a different word after all. Why not "writeFLn"?
 Alternatively, "writefln" could be an exception to the rules, but then
 the exception would need a better rationale than "it shouldn't look like
 Java". I mean, if Phobos makes unjustified exceptions to its naming
 conventions here and there for no other good reason than "it looks
 good", it breaks the consistency and makes function names less
 predictable and less readable.

As I indicated above, it's descriptive, consistent and short; a great name.

I disagree, it's descriptive up until "fln". At least that's my opinion.

Don't misunderstand me: I like writefln for its shortness and because it's easy to write. But I don't see how I can justify the name without adding some arbitrary exceptions that look dumb.

Perhaps I should just leave the names alone in Phobos and let Walter's style guide do the job. It says: "Names formed by joining multiple words should have each word other than the first capitalized." which means you can write names as you like as long as it's not words concatenated together.

Source: <http://www.digitalmars.com/d/2.0/dstyle.html>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 07 2009
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Michel Fortin wrote:
 I'm trying to see how I can adapt the guidelines to accept this function 
 ("writefln") and I can't see any sensible rule I could add. Any idea?
 
 Alternatively, "writefln" could be an exception to the rules, but then 
 the exception would need a better rationale than "it shouldn't look like 
 Java".

Um, how about "writefln is AWESOME" for a reason? :)

But seriously, I use writefln ALL the time. I can't find the words to describe how much nicer it is than Tango's Stdout.formatln("blah") or Java's System.out.println("aargh").

-Lars
Aug 05 2009
prev sibling next sibling parent reply Sergey Gromov <snake.scaly gmail.com> writes:
Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

This is a specious argument. splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.
Aug 05 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sergey Gromov wrote:
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:
 
 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing. Andrei
Aug 05 2009
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 17:40:34 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Sergey Gromov wrote:
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:
 
 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing.

But then I thought D2 was about making things better without worrying too much about backward compatibility. I find it dubious that we are ready to make a breaking language change to how properties work, but when it comes to replacing some standard library functions we can't because of backward compatibility. What is the criterion for an acceptable breaking change?

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michel Fortin wrote:
 On 2009-08-05 17:40:34 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 Sergey Gromov wrote:
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing.

But then I thought D2 was about making things better without worrying too much about backward compatibility. I find it dubious that we are ready to do a breaking language change about how properties work, but when it comes to replacing some standard library functions we can't because of backward compatibility. What is the criterion for an acceptable breaking changes?

That's what I keep on telling Walter! That, and the fact that American cars suck! Andrei
Aug 05 2009
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 20:08:43 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 But then I thought D2 was about making things better without worrying 
 too much about backward compatibility. I find it dubious that we are 
 ready to do a breaking language change about how properties work, but 
 when it comes to replacing some standard library functions we can't 
 because of backward compatibility. What is the criterion for an 
 acceptable breaking changes?

That's what I keep on telling Walter! That, and the fact that American cars suck!

You mean it's Walter who doesn't want to break this kind of compatibility? In any case, if Walter can fix alias this so that we can really do implicit casts, it will become possible to return a lazy range that implicitly converts to an array when needed.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 18:12:16 -0400, Bill Baxter <wbaxter gmail.com> said:

 On Wed, Aug 5, 2009 at 2:40 PM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:
 Sergey Gromov wrote:
 
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:
 
 Michel Fortin wrote:
 
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

This is a specious argument. splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing. Andrei

Maybe introduce a convention like python and bearophile? "foo" for eager things and "xfoo" for lazy things is what they use. At least when you first see xfoo, you don't automatically assume you know what that "x" means, and go look it up if you don't know.

One question to ask is which one should be the default. If lazy should be the default, then we want the lazy one to be called "split" and the non-lazy one to be called "eagerSplit" or whatever other convention we pick for non-lazy. "str.eagerSplit()" would just be a shortcut for "str.split().toArray()".

Also, with implicit casts we wouldn't even need to bother about having different names for lazy and non-lazy results, we could just do:

        string[] parts = str.split();

and it would implicitly convert the lazy range to an array. Can this be done with alias this? Would need to test.

        struct Range(T)
        {
                T[] toArray();
                alias toArray this;

                ... other range things here...
        }

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 19:16:17 -0400, Jarrett Billingsley 
<jarrett.billingsley gmail.com> said:

 On Wed, Aug 5, 2009 at 7:00 PM, Michel Fortin<michel.fortin michelf.com> wrote:
 Also, with implicit casts we wouldn't even need to bother about having a
 different names for lazy and non-lazy results, we could just do:
 
        string[] parts = str.split();
 
 and it would implicitly convert the lazy range to an array. Can this be done
 with alias this? Would need to test.
 
        struct Range(T)
        {
                T[] toArray();
                alias toArray this;
 
                ... other range things here...
        }

Sadly it doesn't work. I was hopeful when I found this works:

struct X { int x; alias x this; }
auto x = X(5);
int y = x; // works!

but if you alias a method that returns int to 'this', that line fails.

Looks pretty much like this bug. Put your vote on it.
<http://d.puremagic.com/issues/show_bug.cgi?id=2814>

There's also such an example in the original enhancement proposal for alias this from Andrei.
<http://d.puremagic.com/issues/show_bug.cgi?id=2631>

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Wed, Aug 5, 2009 at 2:40 PM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:
 Sergey Gromov wrote:
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

This is a specious argument. splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

That's if you know how it works. But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing. Andrei

Maybe introduce a convention like python and bearophile? "foo" for eager things and "xfoo" for lazy things is what they use. At least when you first see xfoo, you don't automatically assume you know what that "x" means, and go look it up if you don't know. --bb
Aug 05 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Wed, Aug 5, 2009 at 7:00 PM, Michel Fortin<michel.fortin michelf.com> wrote:
 Also, with implicit casts we wouldn't even need to bother about having a
 different names for lazy and non-lazy results, we could just do:

        string[] parts = str.split();

 and it would implicitly convert the lazy range to an array. Can this be done
 with alias this? Would need to test.

        struct Range(T)
        {
                T[] toArray();
                alias toArray this;

                ... other range things here...
        }

Sadly it doesn't work. I was hopeful when I found this works:

struct X { int x; alias x this; }
auto x = X(5);
int y = x; // works!

but if you alias a method that returns int to 'this', that line fails.
Aug 05 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 05 Aug 2009 18:47:35 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:

 On 2009-08-05 17:40:34 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> said:

 Sergey Gromov wrote:
 Wed, 05 Aug 2009 17:29:11 +1000, Daniel Keep wrote:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?

splitter's only purpose is to return an instance of a Splitter struct. You can't call it "splitLazily" or "splitLazy" because that implies that the function is doing work, when it really isn't.

But if you just use these functions, it's not even remotely obvious what the difference is, and the difference in naming is so subtle that many people will be doomed to forever confuse these functions, myself included. I confuse getopt and getopts shell functions in the same way. I simply can't remember which is which.

Very true. If it weren't for backwards compatibility, I'd simply have split() do the lazy thing. Then array(split()) would do the eager thing.

But then I thought D2 was about making things better without worrying too much about backward compatibility. I find it dubious that we are ready to do a breaking language change about how properties work, but when it comes to replacing some standard library functions we can't because of backward compatibility. What is the criterion for an acceptable breaking changes?

About 500 more posts ;) -Steve
Aug 05 2009
prev sibling next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Michel Fortin wrote:
 As you know, I tried to write some guidelines[1] for naming things in D. 
 Those guidelines looks well at first glance, but then you look at Phobos 
 and you see that half of it use some arbitrary naming rules. Take 
 "writefln" for instance: following my guidelines (as they are currently 
 written) it should be renamed to something like "writeFormattedLine".
 
 [1]: <http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines>

I think naming guidelines aren't a bad thing, but they can be taken too far. We use an automated tool at work to check code and today I was forced to change the name of some classes because they ended in "Queue" or "Dictionary" and "fix the spelling" of "Http" because it thought it was Hungarian notation.
Aug 05 2009
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 03:49:34 -0400, Robert Fraser <fraserofthenight gmail.com> said:

 Michel Fortin wrote:
 As you know, I tried to write some guidelines[1] for naming things in 
 D. Those guidelines looks well at first glance, but then you look at 
 Phobos and you see that half of it use some arbitrary naming rules. 
 Take "writefln" for instance: following my guidelines (as they are 
 currently written) it should be renamed to something like 
 "writeFormattedLine".
 
 [1]: <http://prowiki.org/wiki4d/wiki.cgi?DProgrammingGuidelines>

I think naming guidelines aren't a bad thing, but they can be taken too far. We use an automated tool at work to check code and today I was forced to change the name of some classes because they ended in "Queue" or "Dictionary" and "fix the spelling" of "Http" because it thought it was Hungarian notation.

That's indeed ridiculous. But that's not an example of guidelines gone too far, that's an example of a silly tool that's not even able to apply guidelines correctly. The guidelines I wrote are human-verifiable, not machine-verifiable, and sometimes require judgement.

Exceptions to the guidelines are fine, as long as they have a good rationale supporting them. Even better than making exceptions to the guidelines is creating standardized patterns and adding them to the guidelines, as I did with "to" functions (they don't start with a verb like the guideline says, but they are used as a convention for conversion functions).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 05 2009
prev sibling parent reply Don <nospam nospam.com> writes:
Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called 
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function 
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?
 
 And what about the "array" function? Wouldn't it be clearer if it was 
 "toArray" so we know we're preforming a convertion?
 
 As you know, I tried to write some guidelines[1] for naming things in D. 
 Those guidelines looks well at first glance, but then you look at Phobos 
 and you see that half of it use some arbitrary naming rules. Take 
 "writefln" for instance: following my guidelines (as they are currently 
 written) it should be renamed to something like "writeFormattedLine".

There should be an exception for functions which are analogous to C functions and have well established names in C. (eg, printf). Probably for famous functions in other languages, too. writeln() comes from Pascal, analogy with printf gives us writefln(). So that one's OK.
 I could take a look at std.algorithm and other modules in Phobos to list 
 inconsistencies with the guidelines. From this we could make 
 improvements both to the guideline document and the API. But before 
 going too deep I think we should start with a few examples, such as 
 those above, and see what to do with them.
 
 What does everyone thinks about all this?

Yes, this is great. A review process would be very valuable. Please check the names in std.math. For the most part I have taken the names from the IEEE754-2008 standard, but please make suggestions. As we move towards finalizing D2.0, that module should be one of the first to have its interface frozen.
Aug 05 2009
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-08-05 08:15:49 -0400, Don <nospam nospam.com> said:

 Michel Fortin wrote:
 In std.algorithm, wouldn't it be clearer if "splitter" was called 
 "splitLazily" or "splitLazy"? "splitter" is a noun, but as a function 
 shouldn't it be a verb. "makeSplitter" or "toSplitter" perhaps?
 
 And what about the "array" function? Wouldn't it be clearer if it was 
 "toArray" so we know we're preforming a convertion?
 
 As you know, I tried to write some guidelines[1] for naming things in 
 D. Those guidelines looks well at first glance, but then you look at 
 Phobos and you see that half of it use some arbitrary naming rules. 
 Take "writefln" for instance: following my guidelines (as they are 
 currently written) it should be renamed to something like 
 "writeFormattedLine".

There should be an exception for functions which are analogous to C functions and have well established names in C. (eg, printf). Probably for famous functions in other languages, too.

On that I mostly agree. "printf" can stay as it is in D. There's already a rule for that in the guidelines:

"""
When writing bindings for code in other languages or system functions, types, variables, constants, functions, and arguments can keep their original names.
"""
 writeln() comes from Pascal, analogy with printf gives us writefln(). 
 So that one's OK.

That's an explanation, but is it really a justification? Tell me why it's OK. Is there any significant advantage in borrowing the "writeln" Pascal function name over using the D convention to name the function, say "writeLine"?

I'm thinking about it and realizing that "writeln" and/or "writefln" are *the* functions you always see in hello world programs and other introductions to D. I'm wondering: if we can't follow a naming convention for these, then how can you tell people to follow the conventions elsewhere?

It's not like we have been following a convention that holds across many languages (unlike the math functions, more on that later). The only sort of half convention I see from a wide range of languages is that functions named using variants of "print", "write", "echo", or "put" usually write to the standard output when called in a global context or on a global object representing the console[^1].

[^1]: Based on reading many hello world programs from <http://helloworld.googletoad.com/>, considering only command-line programs.
 I could take a look at std.algorithm and other modules in Phobos to 
 list inconsistencies with the guidelines. From this we could make 
 improvements both to the guideline document and the API. But before 
 going too deep I think we should start with a few examples, such as 
 those above, and see what to do with them.
 
 What does everyone thinks about all this?

Yes, this is great. A review process would be very valuable. Please check the names in std.math. For the most part I have taken the names from the IEEE754-2008 standard, but please make suggestions. As we move towards finalizing D2.0, that module should be one of the first to have its interface frozen.

Great... well, for math functions I think binding them to a cross-language standard makes sense. I mean, should we really rename "sqrt" to "squareRoot"? And what about "lrint", rename as "roundToNearestLong"?

Renaming would make code more readable and function names more guessable for sure (especially with IDEs with suggestion boxes: you write "round" and then you see a list of the available rounding algorithms). But using these long names would make it harder to convert programs to D. I'm thinking that perhaps we could alias the standard names to our functions so that both names are valid. I'm not sure what is best here.

Anyway, I'd like to have a look, but it appears that I'd need to be an IEEE member to download the final standard. What's intriguing is that I found an old draft from 2007 at <http://www.validlab.com/754R/nonabelian.com/754/comments/Q754.129.pdf>, and it doesn't even mention "sqrt", it has "squareRoot" instead.

So I'm a little confused right now: you said you took most of the names from the IEEE 754-2008 standard. Is the final standard different, or did you just take some names (like isNaN) from the standard while keeping the C name for most others (like sqrt)? (I'm just trying to understand here.)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Aug 07 2009