digitalmars.D - State of the Unicode in D

Walter Bright <newshound2 digitalmars.com> writes:

http://training.perl.com/OSCON2011/index.html

This is a good starting point for seeing where we are with Unicode support and 
where we need to go.

Jul 29 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

That \N syntax sugar could easily be replaced by a Phobos function
called via CTFE.

Jul 29 2011

KennyTM~ <kennytm gmail.com> writes:

On Jul 30, 11 07:37, Andrej Mitrovic wrote:
 That \N syntax sugar could easily be replaced by a Phobos function
 called via CTFE.

Possible, but don't do it :). The table would have something like 0x18000 
entries (just a guess). If each character name is 20 letters long, Phobos 
would need to supply a 2 MB file for this rarely used feature. Besides, D 
already has named character entities such as '\&afr;'.
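
To make the estimate above concrete, here is a minimal sketch of the arithmetic, plus what a name lookup like \N boils down to. It uses Python only because its standard library ships the Unicode character name table; it says nothing about how Phobos would implement this. The names ENTRIES, AVG_NAME_LEN and char_from_name are illustrative assumptions, and the entry count is the guess from the post, not a measured figure.

```python
import unicodedata

# Figures taken from the estimate in the post (both are guesses, not measurements).
ENTRIES = 0x18000        # assumed number of named characters
AVG_NAME_LEN = 20        # assumed average name length in bytes

# Size of the raw name data alone, ignoring any index or compression.
table_bytes = ENTRIES * AVG_NAME_LEN   # 1_966_080 bytes, i.e. roughly 2 MB


def char_from_name(name: str) -> str:
    """Hypothetical helper: resolve a Unicode character name to the character.

    This is the lookup a \\N-style escape needs; it requires the full
    name table to be available wherever the lookup runs.
    """
    return unicodedata.lookup(name)


if __name__ == "__main__":
    print(table_bytes)
    print(char_from_name("LATIN SMALL LETTER A WITH ACUTE"))
```

The point of the sketch is only that the lookup itself is trivial; the cost is the table that has to ship with it.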

There are more important features like Unicode properties, normalization 
(á <-> a´), locale-specific casing (dotless i), collation etc. that 
should be supported before having \N.
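
As an illustration of two of the features listed above, the following sketch shows normalization (composed versus decomposed á) and why dotless i needs locale-specific casing. It again uses Python's standard library as a stand-in, not Phobos. The helper lower_turkish is a hypothetical, deliberately simplified function written for this example; real locale-aware casing covers more rules and more languages.

```python
import unicodedata

composed = "\u00e1"       # á as a single code point
decomposed = "a\u0301"    # a followed by COMBINING ACUTE ACCENT

# The two strings render the same but compare unequal until normalized.
nfc = unicodedata.normalize("NFC", decomposed)   # compose
nfd = unicodedata.normalize("NFD", composed)     # decompose


def lower_turkish(text: str) -> str:
    """Hypothetical, simplified Turkish lowercasing.

    Default lowercasing maps 'I' to 'i'; Turkish expects dotless 'ı'
    (U+0131), and dotted capital 'İ' (U+0130) maps to plain 'i'.
    """
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


if __name__ == "__main__":
    print(composed == decomposed, nfc == composed, nfd == decomposed)
    print("I".lower(), lower_turkish("I"))
```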

(I'd prefer these be done via a wrapper to ICU, as most are database-based.)

Jul 29 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 7/29/2011 4:24 PM, Walter Bright wrote:
 http://training.perl.com/OSCON2011/index.html

 This is a good starting point for seeing where we are with Unicode support and
 where we need to go.

One problem: http://d.puremagic.com/issues/show_bug.cgi?id=6403

Jul 29 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 30.07.2011 5:21, Walter Bright wrote:
 On 7/29/2011 4:24 PM, Walter Bright wrote:
 http://training.perl.com/OSCON2011/index.html

 This is a good starting point for seeing where we are with Unicode 
 support and
 where we need to go.

 One problem: http://d.puremagic.com/issues/show_bug.cgi?id=6403

Let me expand a bit on my reply on Bugzilla.
Besides conformance to the Unicode regex standard, which is (going to be) 
fully supported in the upcoming next-gen std.regex, there are other things 
I'd like to note.
Things I'd love to see in an upgrade of std.uni:
     - normalization (at least NFC)
     - Unicode version 5.0 ---> 6.0
     - grapheme support, via a special range on top of string, or at 
       least a plain "stride" function that tells the length of a cluster, 
       à la the one that does UTF-8 decoding
I had to (re)implement a lot of stuff, with the end result that the 
Unicode support in regex is self-contained right now.
Of course, I'd be willing to make arrangements to gradually shift some 
of this stuff back where it belongs, once I'm finished with regexes.
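
To show what a cluster "stride" function might look like next to a UTF-8 stride, here is a rough sketch in Python (chosen for its built-in Unicode database, not because it reflects std.uni). The functions utf8_stride, cluster_stride and clusters are hypothetical names. The cluster rule is a loud simplification: one base code point plus any following combining marks, which is not the full grapheme cluster algorithm of UAX #29.

```python
import unicodedata


def utf8_stride(data: bytes, index: int) -> int:
    """Length in bytes of the UTF-8 sequence whose lead byte is data[index]."""
    lead = data[index]
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def cluster_stride(text: str, index: int) -> int:
    """Length in code points of the cluster starting at text[index].

    Simplified assumption: a cluster is one code point followed by
    zero or more code points with a non-zero combining class.
    """
    end = index + 1
    while end < len(text) and unicodedata.combining(text[end]) != 0:
        end += 1
    return end - index


def clusters(text: str):
    """The 'special range' variant: iterate over a string cluster by cluster."""
    i = 0
    while i < len(text):
        n = cluster_stride(text, i)
        yield text[i:i + n]
        i += n


if __name__ == "__main__":
    print(utf8_stride("\u00e1".encode("utf-8"), 0))
    print(list(clusters("a\u0301b")))
```

The range version is just the stride function applied repeatedly, which is why either form would cover the request.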

-- 
Dmitry Olshansky

Jul 30 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 7/30/2011 12:09 PM, Dmitry Olshansky wrote:
 Let me expand a bit on my reply on Bugzilla.
 Besides conformance to the Unicode regex standard, which is (going to be)
 fully supported in the upcoming next-gen std.regex, there are other things
 I'd like to note.
 Things I'd love to see in an upgrade of std.uni:
 - normalization (at least NFC)
 - Unicode version 5.0 ---> 6.0
 - grapheme support, via a special range on top of string, or at least a plain
 "stride" function that tells the length of a cluster, à la the one that does
 UTF-8 decoding
 I had to (re)implement a lot of stuff, with the end result that the Unicode
 support in regex is self-contained right now.
 Of course, I'd be willing to make arrangements to gradually shift some of this
 stuff back where it belongs, once I'm finished with regexes.

Sounds great!

Jul 30 2011
