
digitalmars.D.bugs - [Issue 1347] New: invalid UTF-8 strings cause access violations and inconsistent behavior in std.regexp

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1347

           Summary: invalid UTF-8 strings cause access violations and
                    inconsistent behavior in std.regexp
           Product: D
           Version: 1.018
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: minor
          Priority: P3
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: thecybershadow gmail.com


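import std.regexp;

void main()
{
        // 0xFF can never appear in well-formed UTF-8.
        ubyte[] data = [0xFF];
        RegExp re = new RegExp(`.*`);
        // Matching against the invalid string is the step that misbehaves.
        re.test(cast(char[])data);
}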

---
Caused me some headache when I tried to process some non-Unicode files and
forgot to convert the data.


-- 
Jul 18 2007
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1347


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

std.regexp is designed to work only with valid UTF strings. To validate UTF
strings, which should be done for input coming from an untrusted source, use
the function std.utf.validate().
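
For illustration, the validation step suggested above might look roughly like
the sketch below. It assumes a current D2 Phobos, where std.utf.validate()
throws an exception named UTFException on malformed input; the 1.018 library
this report was filed against may name the exception differently.
isValidUtf8 is a hypothetical helper introduced here, not part of Phobos.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: returns true if `data` is well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        // validate() returns normally for valid input and throws otherwise.
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    ubyte[] bad = [0xFF];          // the byte from the report above
    ubyte[] good = [0x61, 0x62];   // "ab"

    assert(!isValidUtf8(bad));
    assert(isValidUtf8(good));

    // Only data that passes the check should be cast to a string type and
    // handed to the regular expression code.
}
```

Running a check like this on file contents before the cast would reject the
non-Unicode input up front, so it never reaches the matcher.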


-- 
Sep 03 2007