digitalmars.D.learn - Convert string to wchar.

Jacob Carlborg <doob me.com> writes:
I tried to convert a string into a wchar, but that didn't compile 
because of this template constraint:

https://github.com/D-Programming-Language/phobos/blob/master/std/conv.d#L1770

Is there a way to convert a string into a wchar?

-- 
/Jacob Carlborg
Aug 02 2011
"Jonathan M Davis" <jmdavisProg gmx.com> writes:
> I tried to convert a string into a wchar, but that didn't compile
> because of this template constraint:
>
> https://github.com/D-Programming-Language/phobos/blob/master/std/conv.d#L1770
>
> Is there a way to convert a string into a wchar?
Does that even make sense? What do you want it to do, convert the first code point to a wchar and throw if there's more than one character in the string? That's like asking whether you can convert between a container of ints and an int. I would never expect std.conv.to to support that.

Not to mention, you shouldn't normally be using char or wchar by themselves, because they might not be valid code points. Normally, only dchar should be used when representing an individual character. If you want this, I'd suggest that you simply do something like

    cast(wchar)str.front

What you're asking for is inherently unsafe as far as Unicode goes.

- Jonathan M Davis
Aug 02 2011
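The warning about lone char and wchar values can be made concrete. The following is a minimal sketch in D, not code from the thread; the sample character U+1F600 is an arbitrary choice for illustration. It shows that one code point can span several code units, and that a plain cast to wchar silently keeps only the low 16 bits.

```d
import std.range.primitives : front;

void main()
{
    // One code point outside the Basic Multilingual Plane (sample value).
    string s = "\U0001F600";
    assert(s.length == 4);      // four UTF-8 code units (chars)

    wstring w = "\U0001F600"w;
    assert(w.length == 2);      // two UTF-16 code units (a surrogate pair)

    // front on a string decodes and yields the whole code point as a dchar.
    dchar c = s.front;
    assert(c == 0x1F600);

    // A cast does no checking: it truncates to the low 16 bits,
    // producing U+F600, an unrelated private-use code point.
    wchar truncated = cast(wchar) c;
    assert(truncated == 0xF600);
}
```

For characters inside the Basic Multilingual Plane the cast happens to give the right answer, which is why the problem is easy to miss in testing.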
Jacob Carlborg <doob me.com> writes:
On 2011-08-02 19:51, Jonathan M Davis wrote:
> Does that even make sense? What do you want it to do, convert the first code point to a wchar and throw if there's more than one character in the string? That's like asking whether you can convert between a container of ints and an int. I would never expect std.conv.to to support that.
>
> Not to mention, you shouldn't normally be using char or wchar by themselves, because they might not be valid code points. Normally, only dchar should be used when representing an individual character. If you want this, I'd suggest that you simply do something like
>
>     cast(wchar)str.front
>
> What you're asking for is inherently unsafe as far as Unicode goes.
>
> - Jonathan M Davis

I'm working on a serialization library and I intend to support as many types as possible. So if someone serializes a single wchar, I need to be able to deserialize it. Since the serialized data is represented by a string, in this case, I need to convert a string containing a single character to a wchar when deserializing.

Yes, convert the first code point to a wchar and then throw if there's more than one character in the string.

-- 
/Jacob Carlborg
Aug 02 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 03 August 2011 08:29:09 Jacob Carlborg wrote:
> I'm working on a serialization library and I intend to support as many types as possible. So if someone serializes a single wchar, I need to be able to deserialize it. Since the serialized data is represented by a string, in this case, I need to convert a string containing a single character to a wchar when deserializing.
>
> Yes, convert the first code point to a wchar and then throw if there's more than one character in the string.

Well, while it's understandable that you have to cover pretty much every possible case of converting to and from a string with what you're doing, I don't think that it's at all reasonable to have std.conv.to convert a string to any character type, let alone one other than dchar. It's almost always a horrible idea and should _not_ be encouraged.

So, I'd advise you to just find a way to deal with it appropriately in your own code. I think that it would be a very bad idea for std.conv.to or anything else in Phobos to support such a conversion.

- Jonathan M Davis
Aug 02 2011
Jacob Carlborg <doob me.com> writes:
On 2011-08-03 08:38, Jonathan M Davis wrote:
> Well, while it's understandable that you have to cover pretty much every possible case of converting to and from a string with what you're doing, I don't think that it's at all reasonable to have std.conv.to convert a string to any character type, let alone one other than dchar. It's almost always a horrible idea and should _not_ be encouraged.
>
> So, I'd advise you to just find a way to deal with it appropriately in your own code. I think that it would be a very bad idea for std.conv.to or anything else in Phobos to support such a conversion.
>
> - Jonathan M Davis

Ok, fair enough.

-- 
/Jacob Carlborg
Aug 03 2011
Pelle <pelle.mansson gmail.com> writes:
On Wed, 03 Aug 2011 08:29:09 +0200, Jacob Carlborg <doob me.com> wrote:
> Yes, convert the first code point to a wchar and then throw if there's
> more than one character in the string.

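Not tested, and I might be wrong, but 'to!' should work between dchar and wchar, no?

    import std.conv : to;
    import std.range : empty, front, popFront;

    wchar to_wchar(string s)
    {
        auto c = s.front;  // decodes the first code point as a dchar
        s.popFront();
        assert (s.empty);  // nothing may follow that code point
        return to!wchar(c);
    }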
Aug 03 2011
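Since assert is compiled out in release builds, a deserializer that has to reject bad input would probably want to throw instead. A hedged sketch of that variant follows; the name toWchar and the error messages are hypothetical, and using ConvException for the failures is an assumption rather than anything the thread prescribes.

```d
import std.conv : ConvException, to;
import std.exception : assertThrown;
import std.range.primitives : empty, front, popFront;

/// Hypothetical helper: converts a string holding exactly one
/// code point into a wchar, throwing on any other input.
wchar toWchar(string s)
{
    if (s.empty)
        throw new ConvException("expected one character, got an empty string");

    immutable dchar c = s.front; // decodes the first code point
    s.popFront();

    if (!s.empty)
        throw new ConvException("expected one character, got more");

    // Throws ConvOverflowException (a ConvException) if c does not fit.
    return to!wchar(c);
}

void main()
{
    assert(toWchar("a") == 'a');
    assert(toWchar("\u00E5") == '\u00E5');             // two UTF-8 code units, one code point
    assertThrown!ConvException(toWchar(""));           // nothing to convert
    assertThrown!ConvException(toWchar("ab"));         // more than one code point
    assertThrown!ConvException(toWchar("\U0001F600")); // does not fit in a wchar
}
```

Keeping the helper in the serialization library, rather than asking std.conv.to to do it, matches the advice given earlier in the thread.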
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 03 August 2011 09:34:53 Pelle wrote:
> Not tested, and I might be wrong, but 'to!' should work between dchar and wchar, no?
>
>     wchar to_wchar(string s)
>     {
>         auto c = s.front;
>         s.popFront();
>         assert (s.empty);
>         return to!wchar(c);
>     }

It's debatable as to whether std.conv.to should be able to convert between code units like that (you _really_ shouldn't ever be using char or wchar outside of arrays or other ranges), but it does appear to compile, for better or worse.

- Jonathan M Davis
Aug 03 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 03 August 2011 01:02:02 Jonathan M Davis wrote:
> It's debatable as to whether std.conv.to should be able to convert between code units like that (you _really_ shouldn't ever be using char or wchar outside of arrays or other ranges), but it does appear to compile, for better or worse.

It looks like the conversion works as long as the character in question will fit in the character type that you're converting to. If it doesn't fit, then it throws. So, it's as safe as such a conversion can be (though it still isn't generally a good idea to use individual chars or wchars in code).

- Jonathan M Davis
Aug 03 2011
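The behaviour described above can be checked with a small program. This is a sketch under the assumption that the narrowing check in std.conv.to still works as reported in this thread; the two sample code points are arbitrary.

```d
import std.conv : ConvOverflowException, to;
import std.exception : assertThrown;

void main()
{
    dchar inBmp = '\u20AC';       // EURO SIGN, fits in 16 bits
    dchar outside = '\U0001F600'; // needs a surrogate pair in UTF-16

    // The value fits, so the conversion succeeds unchanged.
    assert(to!wchar(inBmp) == '\u20AC');

    // The value exceeds wchar.max, so to! throws instead of truncating.
    assertThrown!ConvOverflowException(to!wchar(outside));
}
```

Unlike a bare cast, the failure is reported rather than hidden, which is what a deserializer needs.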
Jacob Carlborg <doob me.com> writes:
On 2011-08-03 09:34, Pelle wrote:
> Not tested, and I might be wrong, but 'to!' should work between dchar and wchar, no?
>
>     wchar to_wchar(string s)
>     {
>         auto c = s.front;
>         s.popFront();
>         assert (s.empty);
>         return to!wchar(c);
>     }

Ok, thanks.

-- 
/Jacob Carlborg
Aug 03 2011