
digitalmars.D.learn - strings and ranges

"Jason den Dulk" <public2 jasondendulk.com> writes:
Hello.

When working with my code I noticed that if I use front on a 
char[], it yields a dchar. Am I correct in concluding that it 
does a UTF-8 to UTF-32 conversion and popFront will skip the whole 
character, not just a code unit?

Also, does this mean that if I'm creating an output range for 
char[], will I need to implement a put(dchar) as well as a 
put(char)?

Thanks
Regards
Jason
Aug 14 2013
"anonymous" <anonymous example.com> writes:
On Thursday, 15 August 2013 at 00:49:00 UTC, Jason den Dulk wrote:
 When working with my code I noticed that if I use front on a 
 char[], it yields a dchar. Am I correct in concluding that it 
 does a UTF-8 to UTF-32 conversion and popFront will skip the 
 whole character, not just a code unit?
yup
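A minimal sketch of the behaviour being confirmed here, assuming Phobos with auto-decoding of narrow strings. The helper name unitsPopped is made up for illustration; front, popFront and ElementType come from std.range.primitives.

```d
import std.range.primitives : front, popFront, ElementType;

/// Hypothetical helper: returns how many code units one popFront
/// call removes from the front of s.
size_t unitsPopped(const(char)[] s)
{
    immutable before = s.length;
    s.popFront();               // drops one whole code point
    return before - s.length;
}

void main()
{
    // As a range, char[] has dchar elements, not char elements.
    static assert(is(ElementType!(char[]) == dchar));

    char[] s = "\u00E9l".dup;   // U+00E9 occupies two UTF-8 code units
    auto c = s.front;           // decoded on access
    static assert(is(typeof(c) == dchar));
    assert(c == '\u00E9');

    assert(s.length == 3);          // length still counts code units
    assert(unitsPopped(s) == 2);    // the two-unit character goes in one step
    assert(unitsPopped("l") == 1);  // an ASCII character is one unit
}
```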
 Also, does this mean that if I'm creating an output range for 
 char[], will I need to implement a put(dchar) as well as a 
 put(char)?
I think you don't need put(char). put(char[]) or put(const(char)[]) could be worthwhile to prevent decoding. But put(dchar) alone would suffice.
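A rough sketch of that suggestion, not taken from any real library: the struct name Utf8Sink and its data field are made up for illustration. It assumes std.utf.encode with the (out char[4], dchar) overload for re-encoding a code point as UTF-8.

```d
import std.range.primitives : isOutputRange;
import std.utf : encode;

/// Hypothetical output range that collects UTF-8 in a char[] buffer.
struct Utf8Sink
{
    char[] data;

    /// Takes a decoded code point and re-encodes it as UTF-8.
    void put(dchar c)
    {
        char[4] buf;
        immutable n = encode(buf, c);   // number of code units written
        data ~= buf[0 .. n];
    }

    /// Optional fast path: appends an already-encoded slice as-is,
    /// so nothing has to be decoded and re-encoded.
    void put(const(char)[] s)
    {
        data ~= s;
    }
}

void main()
{
    static assert(isOutputRange!(Utf8Sink, dchar));
    static assert(isOutputRange!(Utf8Sink, const(char)[]));

    Utf8Sink sink;
    sink.put('h');          // a char converts implicitly to dchar
    sink.put('\u00E9');     // re-encoded as two code units
    sink.put("llo");        // slice overload, no decoding
    assert(sink.data == "h\u00E9llo");
    assert(sink.data.length == 6);
}
```

Note that the put(dchar) overload will also accept a lone char, so a stray non-ASCII code unit passed that way would be treated as a code point of its own.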
Aug 14 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, August 15, 2013 02:48:58 Jason den Dulk wrote:
 Hello.
 
 When working with my code I noticed that if I use front on a
 char[], it yields a dchar. Am I correct in concluding that it
 does a UTF-8 to UTF-32 conversion and popFront will skip the whole
 character, not just a code unit?
 
 Also, does this mean that if I'm creating an output range for
 char[], will I need to implement a put(dchar) as well as a
 put(char)?
All strings are treated as ranges of dchar when using the range APIs, so you pretty much don't do anything with char or wchar where ranges are concerned unless you're optimizing a particular function for narrow strings. There is no reason to implement put(char), just put(dchar).

Range-based code shouldn't generally care what type of string it's dealing with, so you wouldn't normally be writing any range-based code that cared about char[] unless you're optimizing a particular function's implementation (in which case, all of that would be internal to the function and wouldn't affect its semantics).

Here are a couple of stackoverflow questions that discuss ranges and strings. Perhaps, you'll find them useful.

http://stackoverflow.com/questions/16590650/how-to-read-a-string-character-by-character-as-a-range-in-d
http://stackoverflow.com/questions/12288465/std-algorithm-joinerstring-string-why-result-elements-are-dchar-and-not-ch

- Jonathan M Davis

P.S. I really should finish writing the article that I started explaining ranges. So much to do, so little time.
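To illustrate what an internal narrow-string optimization might look like, here is a small sketch. The function countSpaces is a made-up example, not a Phobos function. It assumes std.traits.isNarrowString and std.string.representation. Callers see the same result for every string type; only the implementation differs.

```d
import std.range.primitives : isInputRange, ElementType, front, popFront, empty;
import std.string : representation;
import std.traits : isNarrowString;

/// Hypothetical example: counts ASCII spaces in any range of characters.
size_t countSpaces(R)(R r)
if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t n;
    static if (isNarrowString!R)
    {
        // Internal fast path: 0x20 never occurs inside a multi-unit
        // UTF-8 or UTF-16 sequence, so raw code units can be scanned
        // without decoding.
        foreach (u; r.representation)
            if (u == ' ')
                ++n;
    }
    else
    {
        // Generic path: walk the range one element at a time.
        for (; !r.empty; r.popFront())
            if (r.front == ' ')
                ++n;
    }
    return n;
}

void main()
{
    assert(countSpaces("a \u00E9 b") == 2);     // string, fast path
    assert(countSpaces("a \u00E9 b"w) == 2);    // wstring, fast path
    assert(countSpaces("a \u00E9 b"d) == 2);    // dstring, generic path
}
```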
Aug 15 2013
"monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 15 August 2013 at 00:49:00 UTC, Jason den Dulk wrote:
 Also, does this mean that if I'm creating an output range for 
 char[], will I need to implement a put(dchar) as well as a 
 put(char)?
Unfortunately, right now, yes. "put" doesn't know how to convert on the fly to the right type. However, I have an open pull request so that anything that accepts some form of character, or character string, can be fed any form of character, or character stream.
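As a stopgap, the conversion can be done by hand in a small wrapper. The sketch below is only an illustration of the idea and is not the code from the pull request: DcharSink and putAny are made-up names. It relies on the language feature that foreach with a dchar loop variable decodes char and wchar strings.

```d
import std.traits : isSomeChar, isSomeString;

/// Hypothetical sink that only understands whole code points.
struct DcharSink
{
    dchar[] data;
    void put(dchar c) { data ~= c; }
}

/// Hypothetical helper: forwards a single character of any width.
/// A lone non-ASCII char or wchar code unit is not a full code point,
/// so this overload is only meaningful for complete characters.
void putAny(Sink, C)(ref Sink sink, C c)
if (isSomeChar!C)
{
    sink.put(c);    // char and wchar convert implicitly to dchar
}

/// Hypothetical helper: forwards a string of any width,
/// decoding it one code point at a time.
void putAny(Sink, S)(ref Sink sink, S s)
if (isSomeString!S)
{
    foreach (dchar c; s)    // decodes UTF-8 and UTF-16 on the fly
        sink.put(c);
}

void main()
{
    DcharSink sink;
    sink.putAny('h');           // char
    sink.putAny("\u00E9l");     // UTF-8 string
    sink.putAny("lo"w);         // UTF-16 string
    assert(sink.data == "h\u00E9llo"d);
}
```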
Aug 15 2013