www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 10717] New: std.ascii.toLower and toUpper should return char instead of dchar and avoid me to use a bad cast(char)

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717

           Summary: std.ascii.toLower and toUpper should return char
                    instead of dchar and avoid me to use a bad cast(char)
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



If I use functions from std.ascii I am stating that I am working with ASCII
chars.

Also std.ascii docs say: "All of the functions in std.ascii accept unicode
characters but effectively ignore them."

So if I use std.ascii.toLower on an ASCII char, in most times (all times so
far) I want the result to be a char. If std.ascii.toLower returns a dchar it
forces me to use a cast(char) in most (or all) cases. Such cast doesn't
increase safety at all, it actually decreases it.

So I'd like std.ascii.toLower and std.ascii.toUpper to return char.

(This has a severity 'normal' instead of 'enhancement' because I regard that as
a bug in the design of those two functions.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 26 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com
           Severity|normal                      |enhancement



PDT ---
I really don't think that this is a bug. By taking dchar, they work with ranges
of dchar and allow you to operate on strings that contain Unicode in cases
where all you care about are particular ASCII characters (like when you only
care about ASCII whitespace in a string - not Unicode whitespace - but the
string contains Unicode characters). And as soon the functions accept dchar,
they must return dchar, or they'll destroy any Unicode characters that get
passed in.

A reasonable enhancement request would be to overload these functions with
overloads which take char and return char, but there is actual value in having
them accept and return dchar, and it would break existing code if they stopped.
So, the functions that are there are there to stay, but you may get overloads
which operate specifically on char.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 26 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717






 So, the functions that are there are there to stay, but you may get overloads
 which operate specifically on char.
OK. Thank for your comments and corrections Jonathan. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 26 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull



PDT ---
https://github.com/D-Programming-Language/phobos/pull/1436

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 27 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717





 https://github.com/D-Programming-Language/phobos/pull/1436
Thank you again Jonathan :-) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 27 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com





 https://github.com/D-Programming-Language/phobos/pull/1436
Thank you again Jonathan :-)
The new semantics are: //---- If $(D c) is an uppercase ASCII character, then its corresponding lowercase letter is returned. Otherwise, $(D c) is returned. $(D C) can be any type which implicitly converts to $(D dchar). In the case where it's a built-in type, $(D Unqual!C) is returned, whereas if it's a user-defined type, $(D dchar) is returned. //---- Does this fit the bill, or do you see anything else that needs to be addressed? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 31 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717






 Does this fit the bill, or do you see anything else that needs to be addressed?
I have tried it, and the semantics seems OK. But the implementation is: auto toLower(C)(C c) if(is(C : dchar)) { static if(isScalarType!C) return isUpper(c) ? cast(Unqual!C)(c + 'a' - 'A') : cast(Unqual!C)c; else return toLower!dchar(c); } So a call to toLower can cause two function calls. toLower() is a tiny function that could be called many many times, so perhaps in debug mode (without inlining) doubling the number of function calls slows down the user code a bit. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 31 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717





 So a call to toLower can cause two function calls. toLower() is a tiny function
 that could be called many many times, so perhaps in debug mode (without
 inlining) doubling the number of function calls slows down the user code a bit.
Well, it can cause two function calls only for user defined types, so the case should not be that common. Doing it this way means you'll only end up instanciating `toLower!dchar` for all user types. Which is also a plus. One of the implementation I had suggested was: ////-------- //Public template that only filters and changes return type auto toUpper(C)(C c) safe pure nothrow if (is(C : dchar)) out(result) { assert(!isLower(result)); } body { alias Ret = Select!(isScalarType!C, C, dchar); return cast(Ret) toUpperImpl(c); } //Non template that does actual job. private dchar toUpperImpl(dchar c) safe pure nothrow { return isLower(c) ? cast(dchar)(c - ('a' - 'A')) : c; } ////-------- -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 31 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717




Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/28c1a8480a61054e3ea64b3e55b8c90a3c1754a0


This makes it so that if you're operating on chars or wchars with
std.ascii functions, you don't have to cast the return values back to
char or wchar, but it still allows you to operate on dchars without
mangling any Unicode values.

https://github.com/D-Programming-Language/phobos/commit/f7a593fe9cb0567bbb59a79653ae3a8538ffc7b0




Merged.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 02 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 02 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10717




Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/978250b3aa23b87f08c37dccb7642f5c77c15561
Fix Issue 10717 - std.ascii.toLower and toUpper should return char instead of
dchar and avoid me to use a bad cast(char)

It turns out it was incorrectly working for enum types.

New semantics are that enums are also accepted, and their OriginalType is
returned.

https://github.com/D-Programming-Language/phobos/commit/f621b5692f575fc83ec588d10d93dbb222709663


Fix Issue 10717 - std.ascii.toLower and toUpper should return char inste...

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 22 2013