digitalmars.D - Re: Questions about Unicode, particularly Japanese
- Ruslan Nikolaev <nruslan_devel yahoo.com> Jun 08 2010
- "Steven Schveighoffer" <schveiguy yahoo.com> Jun 08 2010
- "Nick Sabalausky" <a a.a> Jun 08 2010
Sorry if this shows up as a top post again in your mail clients. I'll try to figure out what's going on later today.
> 1. Am I correct in all of that?
Yes. That's the reason I was saying that UTF-16 is *NOT* a lousy encoding. It really depends on the situation. The advantage is not only space but also faster processing speed (even for 2-byte letters: Greek, Cyrillic, etc.), since those 2 bytes can be read in one memory access, as opposed to UTF-8. Also, consider another thing: it's easier (and cheaper) to convert from ANSI to UTF-16, since a direct lookup table can be created. Whereas for UTF-8, you'll have to do some shifts to build a multi-byte sequence for non-ASCII letters (even for Latin ones). Which encoding is better depends on your taste, language, applications, etc. I was simply pointing out that it's quite nice to have a universal 'tchar' type. My argument was never about which encoding is better - it's hard to tell in general. Besides, many people still use ANSI and not UTF-8.
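To make the table-versus-shifts point concrete, here is a minimal sketch in Python. It assumes "ANSI" means a single-byte code page (Windows-1251 is used purely as an example), and the names `CP1251_TO_UTF16`, `ansi_to_utf16` and `utf8_encode_bmp` are made up for illustration. It only covers code points in the Basic Multilingual Plane, which is all a single-byte code page like this one can produce.

```python
# Build a 256-entry lookup table once: byte value -> UTF-16 code unit.
# Windows-1251 is only an example code page; byte 0x98 is undefined there,
# so it maps to U+FFFD (the replacement character).
CP1251_TO_UTF16 = [
    ord(bytes([b]).decode("cp1251", errors="replace")) for b in range(256)
]

def ansi_to_utf16(data: bytes) -> list:
    """Convert code-page bytes to UTF-16 code units: one table lookup per byte."""
    return [CP1251_TO_UTF16[b] for b in data]

def utf8_encode_bmp(cp: int) -> bytes:
    """Encode one BMP code point as UTF-8 using shifts and masks."""
    if cp < 0x80:                       # ASCII: 1 byte, no shifting needed
        return bytes([cp])
    if cp < 0x800:                      # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xE0 | (cp >> 12),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])

if __name__ == "__main__":
    raw = "Привет".encode("cp1251")                   # 6 bytes of "ANSI" text
    units = ansi_to_utf16(raw)                         # 6 UTF-16 code units
    utf8 = b"".join(utf8_encode_bmp(u) for u in units) # 12 bytes of UTF-8
    print(len(raw), len(units), len(utf8))
```

The UTF-16 path is a plain array index per input byte, while the UTF-8 path needs a branch on the code point range plus shifts and masks for every non-ASCII letter. Whether that difference matters in practice depends on the workload.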
Jun 08 2010
On Tue, 08 Jun 2010 16:18:54 -0400, Ruslan Nikolaev <nruslan_devel yahoo.com> wrote:
> Sorry if this shows up as a top post again in your mail clients. I'll try to figure out what's going on later today.
It appears as a top-post in my newsreader too.
>> 1. Am I correct in all of that?
> Yes. That's the reason I was saying that UTF-16 is *NOT* a lousy encoding. It really depends on the situation. The advantage is not only space but also faster processing speed (even for 2-byte letters: Greek, Cyrillic, etc.), since those 2 bytes can be read in one memory access, as opposed to UTF-8. Also, consider another thing: it's easier (and cheaper) to convert from ANSI to UTF-16, since a direct lookup table can be created. Whereas for UTF-8, you'll have to do some shifts to build a multi-byte sequence for non-ASCII letters (even for Latin ones). Which encoding is better depends on your taste, language, applications, etc. I was simply pointing out that it's quite nice to have a universal 'tchar' type. My argument was never about which encoding is better - it's hard to tell in general. Besides, many people still use ANSI and not UTF-8.
Wouldn't this suggest that the decision of what character type to use would be more suited to what language you speak than what OS you are running? -Steve
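A quick way to see how the size trade-off follows the language of the text is to compare encoded lengths directly. This is only an illustrative sketch: the sample strings and the helper name `encoded_sizes` are made up, and UTF-16 is measured without a byte order mark.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Illustrative samples only; real text mixes scripts, digits and punctuation.
SAMPLES = {
    "English": "Hello",
    "Russian": "Привет",
    "Japanese": "こんにちは",
}

if __name__ == "__main__":
    for language, text in SAMPLES.items():
        utf8_size, utf16_size = encoded_sizes(text)
        print(f"{language}: UTF-8 = {utf8_size} bytes, UTF-16 = {utf16_size} bytes")
```

For these samples the English word takes 5 bytes in UTF-8 and 10 in UTF-16, the Russian word takes 12 in both, and the Japanese word takes 15 in UTF-8 and 10 in UTF-16.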
Jun 08 2010
"Ruslan Nikolaev" <nruslan_devel yahoo.com> wrote in message news:mailman.138.1276028343.24349.digitalmars-d puremagic.com...
> Sorry, if it's again top post in your mail clients. I'll try to figure out what's going on later today.
>> 1. Am I correct in all of that?
> Yes. That's the reason I was saying that UTF-16 is *NOT* a lousy encoding. It really depends on the situation. The advantage is not only space but also faster processing speed (even for 2-byte letters: Greek, Cyrillic, etc.), since those 2 bytes can be read in one memory access, as opposed to UTF-8. Also, consider another thing: it's easier (and cheaper) to convert from ANSI to UTF-16, since a direct lookup table can be created. Whereas for UTF-8, you'll have to do some shifts to build a multi-byte sequence for non-ASCII letters (even for Latin ones).
Yea, I need to remember not to try to post late at night ;)
Jun 08 2010