digitalmars.D - std.string.toupper/tolower failed with mixture of English and Chinese characters
- Shawn Liu <Shawn_member pathlink.com> Nov 21 2005
- "Kris" <fu bar.com> Nov 21 2005
- Thomas Kuehne <thomas-dloop kuehne.cn> Nov 26 2005
- Derek Parnell <derek psych.ward> Nov 21 2005
std.string.toupper() and std.string.tolower() give wrong results when dealing with a mixture of upper/lower-case English and Chinese characters, e.g.

char[] a = "AbCdÖŠeFgH";
char[] b = std.string.toupper(a);
char[] c = std.string.tolower(a);

The length of a is 11, but the length of b and c is now 18.
Nov 21 2005
"Shawn Liu" <Shawn_member pathlink.com> wrote...
> std.string.toupper() and std.string.tolower() give wrong results when dealing with a mixture of upper/lower-case English and Chinese characters, e.g.
> char[] a = "AbCdÖŠeFgH";
> char[] b = std.string.toupper(a);
> char[] c = std.string.tolower(a);
> The length of a is 11, but the length of b and c is now 18.
Phobos doesn't support non-ASCII conversions/comparisons at this time?
Nov 21 2005
[follow-up set to: digitalmars.D.bugs]
Kris wrote on 2005-11-22:
> "Shawn Liu" <Shawn_member pathlink.com> wrote...
>> std.string.toupper() and std.string.tolower() give wrong results when dealing with a mixture of upper/lower-case English and Chinese characters, e.g.
>> char[] a = "AbCdÖŠeFgH";
>> char[] b = std.string.toupper(a);
>> char[] c = std.string.tolower(a);
>> The length of a is 11, but the length of b and c is now 18.
> Phobos doesn't support non-ASCII conversions/comparisons at this time?
Phobos does, at least for the simple conversions. Whichever cases are handled, the data that isn't handled shouldn't get corrupted. The attached zipped string.d fixes toupper/tolower and extends the unittests. (Yes, I know, it isn't the fastest possible algorithm ...)
Thomas
Nov 26 2005
On Tue, 22 Nov 2005 02:19:50 +0000 (UTC), Shawn Liu wrote:
> std.string.toupper() and std.string.tolower() give wrong results when dealing with a mixture of upper/lower-case English and Chinese characters, e.g.
> char[] a = "AbCdÖŠeFgH";
> char[] b = std.string.toupper(a);
> char[] c = std.string.tolower(a);
> The length of a is 11, but the length of b and c is now 18.
If it isn't ASCII then DMD doesn't want to know about it. Try the Mango library for its ICU bindings, I think that might have it. -- Derek (skype: derek.j.parnell) Melbourne, Australia 22/11/2005 1:33:24 PM
Nov 21 2005








