digitalmars.D.learn - string and char[] in Phobos

Puming <zhaopuming gmail.com> writes:
Hi,

I saw on the forum that functions taking string-like arguments 
should use `in char[]` instead of `string`, because they can then 
accept both string and char[].

But recently, when actually using D, I found that many Phobos 
functions/constructors take `string`, while many return `char[]`, 
forcing me to do a lot of conv.to!string. And many times I have 
to fight with excessive template error messages.

Is there a reason to use `string` instead of `in char[]` for 
function arguments? Do you intend to change those Phobos functions?
Mar 18 2016
Kagamin <spam here.lot> writes:
When a string is not an in parameter, it can't be declared `in 
char[]`.
Mar 18 2016
Jonathan M Davis via Digitalmars-d-learn writes:
When a function accepts const(char)[], it can accept char[], const(char)[], const(char[]), immutable(char)[], and immutable(char[]), whereas if it accepts string, then all it accepts are immutable(char)[] and immutable(char[]). So, string is more restrictive. But if you need to return a slice of the array you passed in, and your function accepts const rather than mutable or immutable, then the slice has to be const, and you've lost the type information. That is why inout exists - e.g. if you have inout(char)[] and return the array that you're given, then its constness doesn't change.

But inout only works when you return the same type as you pass in. And if the function needs to store the string somewhere (e.g. if it's a property function for setting a member variable), then accepting string makes more sense if what it stores is a string. Otherwise, it would have to allocate a new string (and storing const(char)[] would risk having it change after it was passed to the function). So, the exact constness that should be used depends heavily on what the function is doing. Ali's DConf 2013 talk discusses some of these issues:

http://dconf.org/2013/talks/cehreli.html

Most functions in Phobos that operate on strings are actually templatized so that they work with varying constness and character type - either that, or they're templatized and operate on arbitrary ranges and not arrays specifically at all. That avoids most of these issues but does mean that the function needs to be templated.

I don't know what you're using in Phobos that takes string and returns char[]. That implies an allocation, and if the function is pure, char[] may have been selected because it can be implicitly converted to string, thanks to the fact that the compiler can prove that the char[] being returned had to have been allocated in the function and that there can be no other references to that array. But without knowing exactly which functions you're talking about, I can't really say.
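As a rough illustration of the two points above (which argument types each signature accepts, and how the result of a pure function can convert to string), here is a minimal sketch. The functions `countSpaces`, `countSpacesStrict`, and `shout` are hypothetical names made up for this example; they are not Phobos functions.

```d
// Hypothetical helper: only reads its argument, so const(char)[] is enough.
size_t countSpaces(const(char)[] text)
{
    size_t n = 0;
    foreach (c; text)
        if (c == ' ')
            ++n;
    return n;
}

// Hypothetical helper: same work, but the signature demands immutable data.
size_t countSpacesStrict(string text)
{
    return countSpaces(text);
}

// Hypothetical pure function that takes string and returns a freshly
// allocated char[]. Its only parameter is immutable, so the compiler can
// tell that nothing else refers to the returned array.
char[] shout(string text) pure
{
    auto buf = new char[](text.length + 1);
    buf[0 .. text.length] = text[];
    buf[$ - 1] = '!';
    return buf;
}

void main()
{
    char[] mutableText = "a b c".dup;
    const(char)[] constText = mutableText;
    string immutableText = "a b c";

    // const(char)[] accepts mutable, const, and immutable arrays.
    assert(countSpaces(mutableText) == 2);
    assert(countSpaces(constText) == 2);
    assert(countSpaces(immutableText) == 2);

    // string accepts only immutable data.
    assert(countSpacesStrict(immutableText) == 2);
    static assert(!__traits(compiles, countSpacesStrict(mutableText)));
    static assert(!__traits(compiles, countSpacesStrict(constText)));

    // The same return value can be used as string or as char[].
    string asString = shout("hi");
    char[] asMutable = shout("hi");
    asMutable[0] = 'H';
    assert(asString == "hi!");
    assert(asMutable == "Hi!");
}
```

If `shout` were not marked pure, the assignment to `string` would be expected to fail to compile and the caller would need `.idup` or `to!string` instead.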
In general though, the solution that we've gone with is to templatize functions that operate on strings, and a function that takes a string explicitly is most likely storing it, in which case it needs an explicit type, and using an immutable value ensures that it doesn't change later. If you want better insight into what the functions you're referring to do and why, then you'll need to be specific about which ones you're talking about.

In any case, the general approach that Phobos takes is to operate on ranges of characters and only occasionally use arrays specifically - except in cases where the value needs to be stored, in which case string is typically what's used. It used to be that explicit strings were used more, but we've been moving to using ranges as much as possible, so actually seeing string in Phobos should be fairly rare and getting rarer.

On a side note, I'd strongly argue against using "in" on function arguments that aren't delegates. in is equivalent to const scope, and scope currently does nothing for any types other than delegates - but it might later, in which case you could be forced to change your code, depending on the exact semantics of scope for non-delegates. But it does _nothing_ now with non-delegate types regardless, so it's a meaningless attribute that might change meaning later, which makes using it a very bad idea IMHO. Just use const if you want const and leave scope for delegates. I'd actually love to see in deprecated, because it adds no value to the language (since it's equivalent to const scope, which you can use explicitly), and it hides the fact that scope is used.

- Jonathan M Davis
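The templatized, range-based style described in this post can be sketched roughly as follows. `countChar` is a hypothetical function written for this illustration, not something taken from Phobos; the constraint uses `isInputRange`, `ElementType`, and `isSomeChar` from the standard library.

```d
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;

// Hypothetical helper: counts occurrences of `needle` in any input range
// of characters, regardless of constness or character width.
size_t countChar(R)(R chars, dchar needle)
if (isInputRange!R && isSomeChar!(ElementType!R))
{
    size_t n = 0;
    foreach (c; chars)
        if (c == needle)
            ++n;
    return n;
}

void main()
{
    import std.algorithm.iteration : filter;

    char[] mutableText = "a-b-c".dup;
    string immutableText = "a-b-c";
    wstring wideText = "a-b-c"w;

    // One template covers mutable, immutable, and wide strings.
    assert(countChar(mutableText, '-') == 2);
    assert(countChar(immutableText, '-') == 2);
    assert(countChar(wideText, '-') == 2);

    // It also accepts a lazy range that is not an array at all.
    auto withoutB = immutableText.filter!(c => c != 'b');
    assert(countChar(withoutB, '-') == 2);
}
```

The cost is the one mentioned above: the function has to be a template, so mistakes at the call site tend to surface as constraint failures rather than simple type mismatches.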
Mar 18 2016
Puming <zhaopuming gmail.com> writes:
On Friday, 18 March 2016 at 20:06:27 UTC, Jonathan M Davis wrote:
 When a function accepts const(char)[], it can accept char[],
 const(char)[], const(char[]), immutable(char)[], and 
 immutable(char[]),
 whereas if it accepts string, then all it accepts are
 immutable(char)[] and immutable(char[]). So, it's more
So I need to use const(char)[] in my function definitions instead of in char[]?
 restrictive, but if
 you need to return a slice of the array you passed in, if your 
 function
 accepts const rather than mutable or immutable, then the slice 
 has to be
 const, and you've lost the type information, which is why inout 
 exists -
Well, I never got inout until now, thanks!
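For readers in the same position, here is a small sketch of what inout buys over const when a function returns a slice of its argument. `skipSpaces` and `skipSpacesConst` are hypothetical names used only for this illustration.

```d
// Hypothetical helper using inout: the result has the same constness
// as whatever the caller passed in.
inout(char)[] skipSpaces(inout(char)[] text)
{
    size_t i = 0;
    while (i < text.length && text[i] == ' ')
        ++i;
    return text[i .. $];
}

// The const version accepts the same arguments but always returns const.
const(char)[] skipSpacesConst(const(char)[] text)
{
    return skipSpaces(text);
}

void main()
{
    char[] m = "  hi".dup;
    const(char)[] c = m;
    string s = "  hi";

    // With inout, the type information survives the call.
    static assert(is(typeof(skipSpaces(m)) == char[]));
    static assert(is(typeof(skipSpaces(c)) == const(char)[]));
    static assert(is(typeof(skipSpaces(s)) == string));

    // With const, it is lost: a mutable argument comes back as const.
    static assert(is(typeof(skipSpacesConst(m)) == const(char)[]));

    // The slice returned for a mutable argument is itself mutable
    // and still refers to the original array.
    skipSpaces(m)[0] = 'H';
    assert(m == "  Hi");
}
```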
 [...]
 I don't know what you're using in Phobos that takes string and 
 returns char[]. That implies an allocation, and if the function 
 is pure, char[] may have been selected, because it could be 
 implicitly converted to string thanks to the fact that the 
 compiler could prove that the char[] being returned had to have 
 been allocated in the function and that there could be no other 
 references to that array. But without knowing exactly which 
 functions you're talking about, I can't really say. In general 
 though, the solution that we've gone with is to templatize 
 functions that operate on strings, and a function that's taking 
 a string explicitly is most likely storing it, in which case, 
 it needs an explicit type, and using an immutable value ensures 
 that it doesn't change later.
I just got this feeling from using functions in the std.file module, like dirEntries, and the File constructor itself. After reading your explanation, it makes sense now. And on a second look, most functions there ARE already templatized. Thanks for the clarification.
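A minimal sketch of the situation described here, assuming (as in the Phobos documentation I am aware of) that `dirEntries` is declared with a `string` path while `exists` and `isDir` accept a mutable `char[]` as well. The path "." is just a placeholder for the current directory.

```d
import std.file : dirEntries, exists, isDir, SpanMode;

void main()
{
    // A path that was built up in a mutable buffer.
    char[] path = ".".dup;

    // These accept the mutable buffer directly.
    assert(exists(path));
    assert(isDir(path));

    // dirEntries wants a string, so make an immutable copy at the call site.
    size_t count = 0;
    foreach (entry; dirEntries(path.idup, SpanMode.shallow))
    {
        string name = entry.name; // DirEntry.name is a string
        assert(name.length > 0);
        ++count;
    }
}
```

Here `.idup` and `to!string` do the same job for a `char[]`; the copy is what guarantees that later changes to the buffer cannot affect the path that was passed in.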
 On a side note, I'd strongly argue against using "in" on 
 function arguments that aren't delegates. in is equivalent to 
 const scope, and scope currently does nothing for any types 
 other than delegates - but it might later, in which case, you 
 could be forced to change your code, depending on the exact 
 semantics of scope for non-delegates. But it does _nothing_ now 
 with non-delegate types regardless, so it's a meaningless 
 attribute that might change meaning later, which makes using it 
 a very bad idea IMHO. Just use const if you want const and 
 leave scope for delegates. I'd actually love to see in 
 deprecated, because it adds no value to the language (since 
 it's equivalent to const scope, which you can use explicitly), 
 and it hides the fact that scope is used.
Well, this is too complicated a level for me right now. I'll get to that later when I have learned more of the language.

My takeaways from your post:

- when the function is pure with respect to the string-like argument, use 'const(char)[]', or 'inout(char)[]' when necessary.
- when the argument is stored by the function, use string.
- manually convert string-like objects to string with to!string when calling those functions.

Are those correct?
 - Jonathan M Davis
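The three takeaways listed above can be sketched in one small program. `countVowels` and `Label` are hypothetical names invented for this illustration; only `std.conv.to` comes from Phobos.

```d
import std.conv : to;

// Takeaway 1: a function that only reads its argument takes const(char)[].
size_t countVowels(const(char)[] text)
{
    size_t n = 0;
    foreach (c; text)
        if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
            ++n;
    return n;
}

// Takeaway 2: a type that stores the text takes string, so the stored
// value cannot change behind its back.
struct Label
{
    private string _text;

    this(string text) { _text = text; }

    @property string text() const { return _text; }
}

void main()
{
    char[] buffer = "hello".dup;

    // No conversion needed for the read-only function.
    assert(countVowels(buffer) == 2);

    // Takeaway 3: convert at the call site when a string is required.
    auto label = Label(buffer.to!string);
    buffer[0] = 'j';
    assert(buffer == "jello");
    assert(label.text == "hello"); // the stored copy is unaffected

    // Converting something that is already a string does not copy it.
    string s = "world";
    assert(s.to!string.ptr is s.ptr);
}
```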
Mar 29 2016