
digitalmars.D.bugs - [Issue 1937] New: std.uri.decode throws wrong exception

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1937

           Summary: std.uri.decode throws wrong exception
           Product: D
           Version: 1.028
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Keywords: diagnostic
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: stefan.zipproth web.de


If the argument to std.uri.decode contains %E4, which is the German umlaut ä,
the exception "URI error" is thrown. This is wrong behaviour, as a URI may
contain umlauts. Here's my C implementation, which does the job:

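#include <stdio.h>

/* Decode the range [src, last) into dest.
   dest must have room for (last - src) + 1 bytes. */
void decode(const char *src, const char *last, char *dest)
{
  for (; src != last; src++, dest++)
    if (*src == '+')
      *dest = ' ';
    else if (*src == '%' && last - src >= 3)
    {
      /* %x expects an unsigned int; '?' stands in for malformed digits. */
      unsigned int code;
      if (sscanf(src + 1, "%2x", &code) != 1) code = '?';
      *dest = (char)code;
      src += 2;  /* skip the two hex digits just consumed */
    }
    else
      *dest = *src;

  *dest = 0;  /* terminate the output string */
}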

To my understanding, decoding is nothing more than a hex code to byte conversion,
so there should be no reason to forbid certain codes and throw exceptions. Also,
after reading the documentation, I expected at least std.uri.decodeComponent to
be a straightforward implementation, but it throws exceptions as well.
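
One possible reason for a decoder to reject certain codes, assuming it must
return well-formed UTF-8 (D strings are UTF-8): not every byte sequence is
valid UTF-8. The following is only a sketch with a hypothetical helper,
is_valid_utf8, which is not part of Phobos. It shows that the single byte 0xE4
fails such a check, because 0xE4 announces a three-byte sequence whose
continuation bytes are missing.

```c
#include <stddef.h>

/* Hypothetical helper, not a Phobos function.
   Returns 1 if the n bytes at s have well-formed UTF-8 structure, else 0.
   Simplified: it checks lead and continuation bytes only and does not
   reject overlong 3/4-byte forms or surrogate code points. */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        size_t extra;  /* number of continuation bytes expected after b */

        if (b < 0x80)                    extra = 0;  /* plain ASCII */
        else if (b >= 0xC2 && b <= 0xDF) extra = 1;
        else if (b >= 0xE0 && b <= 0xEF) extra = 2;
        else if (b >= 0xF0 && b <= 0xF4) extra = 3;
        else return 0;  /* stray continuation byte or invalid lead byte */

        if (n - i - 1 < extra) return 0;  /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80) return 0;  /* not 10xxxxxx */

        i += extra + 1;
    }
    return 1;
}
```

Called on the single byte 0xE4 this returns 0, while the two bytes 0xC3 0xA4
pass the check.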


-- 
Mar 24 2008
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1937


stefan.zipproth web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefan.zipproth web.de
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




------- Comment #1 from stefan.zipproth web.de  2008-03-24 18:51 -------
My own web pages currently use %E4 for the umlaut 'ä', which is its ISO-8859-1
(Latin-1) code, but the standard is UTF-8, which makes it %C3%A4. So this issue
seems to be invalid, and I have to change things for the D port of my web
application.
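
The %C3%A4 form can be derived by hand: 'ä' is code point U+00E4, and UTF-8
splits code points in the range U+0080 to U+07FF into two bytes. A minimal
sketch in C, where percent_encode_cp is a hypothetical helper written only for
this illustration and limited to code points up to U+07FF:

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only.
   Writes the percent-encoded UTF-8 form of code point cp into out,
   which needs room for at least 7 bytes. Only cp <= 0x7FF is handled;
   larger values yield an empty string. */
void percent_encode_cp(unsigned int cp, char *out)
{
    if (cp < 0x80) {
        /* One byte: the code point itself. */
        sprintf(out, "%%%02X", cp);
    } else if (cp <= 0x7FF) {
        /* Two bytes: 110xxxxx 10xxxxxx. */
        unsigned int b1 = 0xC0 | (cp >> 6);
        unsigned int b2 = 0x80 | (cp & 0x3F);
        sprintf(out, "%%%02X%%%02X", b1, b2);
    } else {
        out[0] = '\0';  /* outside the range this sketch covers */
    }
}
```

For cp = 0xE4 the two bytes are 0xC0 | 3 = 0xC3 and 0x80 | 0x24 = 0xA4, so the
output is %C3%A4.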


-- 
Mar 24 2008