digitalmars.D.bugs - [Issue 9173] New: std.string.wrap should conform to Unicode line-breaking algorithm
- d-bugmail puremagic.com (35/35) Dec 17 2012 http://d.puremagic.com/issues/show_bug.cgi?id=9173
http://d.puremagic.com/issues/show_bug.cgi?id=9173

Summary: std.string.wrap should conform to Unicode line-breaking algorithm
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Phobos
AssignedTo: nobody puremagic.com
ReportedBy: hsteoh quickfur.ath.cx

--- Comment #0 from hsteoh quickfur.ath.cx 2012-12-17 13:24:08 PST ---
Currently, there are some issues with std.string.wrap:

1) It uses std.uni.isWhite as the criterion for line-breaking opportunities, but isWhite includes such things as non-breaking space, which should *not* be treated as a place to wrap. It also includes things like vowel mark separators, which shouldn't be treated as wrap points, either.

2) It does not take zero-width characters and combining diacritics into account when counting columns, which means that it will sometimes wrap the line at the wrong place.

3) It does not wrap CJK text or Thai text correctly.

For reference, here's the Unicode technical reference that describes proper line-breaking of Unicode text:

http://www.unicode.org/reports/tr14/

(After having read through TR14, I was in awe at how insanely complicated line-wrapping in Unicode is. So I'd propose that, if nothing else, we should fix items (1) and (2) above, which should be within the reach of a relatively simple-to-implement European-centric line wrapping algorithm. People who want CJK wrapping or other complicated stuff probably want to be writing their own algo anyway.)
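To make items (1) and (2) concrete, here is a rough sketch of the proposed "simple" behaviour. It is written in Python only for illustration and is not a patch for Phobos or an implementation of TR14. The names NON_BREAKING, is_break_opportunity, display_width and wrap are hypothetical, the list of non-breaking characters is a hand-picked assumption, and East Asian wide characters are ignored entirely.

```python
import unicodedata

# Assumption: a small, hand-picked set of space-like characters that should
# never be used as a wrap point. A real implementation would consult the
# TR14 line-break classes instead of a fixed list.
NON_BREAKING = {
    "\u00A0",  # NO-BREAK SPACE
    "\u2007",  # FIGURE SPACE
    "\u202F",  # NARROW NO-BREAK SPACE
    "\u180E",  # MONGOLIAN VOWEL SEPARATOR
}

# General categories counted as occupying no column: nonspacing marks,
# enclosing marks, and format characters (e.g. ZERO WIDTH SPACE).
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def is_break_opportunity(ch):
    """Item (1): whitespace is a wrap point unless it is non-breaking."""
    return ch.isspace() and ch not in NON_BREAKING


def display_width(s):
    """Item (2): count columns, skipping zero-width and combining characters."""
    return sum(
        1 for ch in s
        if unicodedata.category(ch) not in ZERO_WIDTH_CATEGORIES
    )


def split_words(text):
    """Split text at break opportunities, keeping non-breaking spaces inside words."""
    words, current = [], []
    for ch in text:
        if is_break_opportunity(ch):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def wrap(text, columns):
    """Greedy wrap: fill each line with as many words as fit in `columns`."""
    lines, line, width = [], [], 0
    for word in split_words(text):
        w = display_width(word)
        # The extra 1 accounts for the space that would join this word to the line.
        if line and width + 1 + w > columns:
            lines.append(" ".join(line))
            line, width = [], 0
        if line:
            width += 1
        line.append(word)
        width += w
    return "\n".join(lines + ([" ".join(line)] if line else []))
```

With this sketch, wrap("10\u00A0km is far", 8) keeps "10\u00A0km" together on the first line, and wrap("cafe\u0301 au lait", 7) fits "cafe\u0301 au" on one line because the combining acute accent is not counted as a column. Item (3) is deliberately left out, in line with the proposal above.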