
digitalmars.D.bugs - [Issue 11350] New: libphobos2 regex match segfaults when a rare HTTP header is received

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11350

           Summary: libphobos2 regex match segfaults when a rare HTTP
                    header is received
           Product: D
           Version: D2
          Platform: x86
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: sha0 badchecksum.net


--- Comment #0 from sha0coder <sha0 badchecksum.net> 2013-10-25 03:30:36 PDT ---
A simple std.net.curl.get() is performed against a remote host that sends
some rare HTTP headers. I don't define the onReceiveHeader callback, but
libphobos2 calls the default onReceiveHeader(), which applies a regex to the
header and then crashes.

I connect this way:

    import core.time : dur;
    import std.net.curl : HTTP, get;

    auto conn = HTTP();
    conn.connectTimeout(dur!"seconds"(4));
    conn.addRequestHeader("User-agent", "Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0");
    char[] html = get(url, conn);


It seems the bug is at:

/usr/include/dmd/phobos/std/regex.d  line 6348

6537 public auto match(R, RegEx)(R input, RegEx re)
6538     if(isSomeString!R && is(RegEx == Regex!(BasicElementOf!R)))
6539 {
6540     return RegexMatch!(Unqual!(typeof(input)),ThompsonMatcher)(re, input);
6541 }

Maybe it is an encoding problem; the input seems to be:
 print "%c%c%c%c%c%c%c%c%c" % (0x64,0x61,0x97,0x48,0x34,0x53,0x54,0x65,0x46)
da�H4STeF

(gdb) bt
#0  0xb76c8d13 in rt.deh2.terminate() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#1  0xb76c8ee3 in _d_throwc () from /usr/lib/i386-linux-gnu/libphobos2.so.0.63
#2  0x080b04cc in
_D3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch43__T6__ctorTS3std5regex12__T5RegexTaZ5RegexZ6__ctorMFNcNeS3std5regex12__T5RegexTaZ5RegexAaZS3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch
(this=0x95ac0774, input=646197483453546546, prog=...)
    at /usr/include/dmd/phobos/std/regex.d:6348
#3  0x080a09a2 in
_D3std5regex45__T5matchTAaTS3std5regex12__T5RegexTaZ5RegexZ5matchFNfAaS3std5regex12__T5RegexTaZ5RegexZS3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch
(__HID46=0x95ac0b18, re=..., input=646197483453546546) at
/usr/include/dmd/phobos/std/regex.d:6540
#4  0xb768e20f in std.net.curl.HTTP.onReceiveHeader() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#5  0xb769125a in std.net.curl.Curl.onReceiveHeader() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#6  0xb7691665 in std.net.curl.Curl._receiveHeaderCallback() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#7  0xb72a5e7a in Curl_client_write () from
/usr/lib/i386-linux-gnu/libcurl.so.4
#8  0xb72a4912 in Curl_http_readwrite_headers () from
/usr/lib/i386-linux-gnu/libcurl.so.4
#9  0xb72bbf6d in Curl_readwrite () from /usr/lib/i386-linux-gnu/libcurl.so.4
#10 0xb72bde4d in ?? () from /usr/lib/i386-linux-gnu/libcurl.so.4
#11 0xb72be793 in curl_easy_perform () from
/usr/lib/i386-linux-gnu/libcurl.so.4
#12 0xb7691093 in std.net.curl.Curl.perform() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#13 0xb768d8e1 in std.net.curl.HTTP._perform() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#14 0xb768d734 in std.net.curl.HTTP.perform() () from
/usr/lib/i386-linux-gnu/libphobos2.so.0.63
#15 0x08081aac in
_D3std3net4curl18__T10_basicHTTPTaZ10_basicHTTPFAxaAxvS3std3net4curl4HTTPZAa
(client=..., sendData=579669917507256320,
    url=10576998119117946914) at /usr/include/dmd/phobos/std/net/curl.d:762
#16 0x08081948 in
_D3std3net4curl30__T3getTS3std3net4curl4HTTPTaZ3getFAxaS3std3net4curl4HTTPZAa
(conn=..., url=10576998119117946914)
    at /usr/include/dmd/phobos/std/net/curl.d:364

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 25 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11350


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com


--- Comment #1 from Dmitry Olshansky <dmitry.olsh gmail.com> 2013-10-25 11:21:26 PDT ---
(In reply to comment #0)
 
 It seems the bug is at:
 
 /usr/include/dmd/phobos/std/regex.d  line 6348
 
 6537 public auto match(R, RegEx)(R input, RegEx re)
 6538     if(isSomeString!R && is(RegEx == Regex!(BasicElementOf!R)))
 6539 {
 6540     return RegexMatch!(Unqual!(typeof(input)),ThompsonMatcher)(re, input);
 6541 }
 
 Maybe is an encoding problem, it seems the input is:
 print "%c%c%c%c%c%c%c%c%c" % (0x64,0x61,0x97,0x48,0x34,0x53,0x54,0x65,0x46)
 da�H4STeF
It would be nice to see what the pattern is and exactly what the argument
passed to it looks like. I tried to reproduce with this:

    void main()
    {
        import std.regex;
        ubyte[] header = [0x64,0x61,0x97,0x48,0x34,0x53,0x54,0x65,0x46];
        auto m = match(cast(char[]) header, regex("(.*?): (.*)$"));
        assert(m.empty);
    }

I get:

    std.utf.UTFException C:\dmd2\windows\bin\..\..\src\phobos\std\utf.d(1113): Invalid UTF-8 sequence (at index 1)

No crashes. Now, it may have to do with shared object / PIC code for all I
know, as I'm testing on Win32. But without a smaller, or at least complete,
reproducible test case there is nothing to work on.
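
The malformed byte can also be confirmed without std.regex at all. The
following is only a sketch: it assumes std.utf.validate throws UTFException on
malformed input, as the Phobos documentation describes, and the helper name
isValidUtf8 is made up for illustration (it is not part of Phobos).

```d
import std.utf : UTFException, validate;

// Hypothetical helper (not part of Phobos): reports whether the bytes form
// well-formed UTF-8 instead of letting the exception escape.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // Bytes from the report: 0x97 is a continuation byte with no lead byte.
    ubyte[] header = [0x64, 0x61, 0x97, 0x48, 0x34, 0x53, 0x54, 0x65, 0x46];
    assert(!isValidUtf8(header));

    // The same bytes with the offending one dropped are plain ASCII.
    ubyte[] ascii = [0x64, 0x61, 0x48, 0x34, 0x53, 0x54, 0x65, 0x46];
    assert(isValidUtf8(ascii));
}
```

A check like this only tells you that the input is not valid UTF-8; it says
nothing about why the process dies instead of reporting the exception.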
Oct 25 2013
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=11350



--- Comment #2 from Dmitry Olshansky <dmitry.olsh gmail.com> 2013-10-25 11:40:08 PDT ---
(In reply to comment #0)
 It seems the bug is at:
No, and I think I know what it is.
 Maybe is an encoding problem, it seems the input is:
 print "%c%c%c%c%c%c%c%c%c" % (0x64,0x61,0x97,0x48,0x34,0x53,0x54,0x65,0x46)
 da�H4STeF
Yes, this is broken UTF-8 and hence...
 
 
 
 (gdb) bt
 #0  0xb76c8d13 in rt.deh2.terminate() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #1  0xb76c8ee3 in _d_throwc () from /usr/lib/i386-linux-gnu/libphobos2.so.0.63
it throws an exception ...
 #2  0x080b04cc in
 _D3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch43__T6__ctorTS3std5regex12__T5RegexTaZ5RegexZ6__ctorMFNcNeS3std5regex12__T5RegexTaZ5RegexAaZS3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch
 (this=0x95ac0774, input=646197483453546546, prog=...)
     at /usr/include/dmd/phobos/std/regex.d:6348
... inside of std.regex.match. But the thing is, we are doing it inside a
callback invoked by the C library libcurl (follow the call stack down to
curl_easy_perform). It HAS NO IDEA what to do with a D exception, hence the
crash. So the fix would be to insulate it with try/catch inside that
onReceiveHeader callback.
 #3  0x080a09a2 in
 _D3std5regex45__T5matchTAaTS3std5regex12__T5RegexTaZ5RegexZ5matchFNfAaS3std5regex12__T5RegexTaZ5RegexZS3std5regex49__T10RegexMatchTAaS273std5regex15ThompsonMatcherZ10RegexMatch
 (__HID46=0x95ac0b18, re=..., input=646197483453546546) at
 /usr/include/dmd/phobos/std/regex.d:6540
 #4  0xb768e20f in std.net.curl.HTTP.onReceiveHeader() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #5  0xb769125a in std.net.curl.Curl.onReceiveHeader() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #6  0xb7691665 in std.net.curl.Curl._receiveHeaderCallback() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #7  0xb72a5e7a in Curl_client_write () from
 /usr/lib/i386-linux-gnu/libcurl.so.4
 #8  0xb72a4912 in Curl_http_readwrite_headers () from
 /usr/lib/i386-linux-gnu/libcurl.so.4
 #9  0xb72bbf6d in Curl_readwrite () from /usr/lib/i386-linux-gnu/libcurl.so.4
 #10 0xb72bde4d in ?? () from /usr/lib/i386-linux-gnu/libcurl.so.4
 #11 0xb72be793 in curl_easy_perform () from
 /usr/lib/i386-linux-gnu/libcurl.so.4
 #12 0xb7691093 in std.net.curl.Curl.perform() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #13 0xb768d8e1 in std.net.curl.HTTP._perform() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #14 0xb768d734 in std.net.curl.HTTP.perform() () from
 /usr/lib/i386-linux-gnu/libphobos2.so.0.63
 #15 0x08081aac in
 _D3std3net4curl18__T10_basicHTTPTaZ10_basicHTTPFAxaAxvS3std3net4curl4HTTPZAa
 (client=..., sendData=579669917507256320,
     url=10576998119117946914) at /usr/include/dmd/phobos/std/net/curl.d:762
 #16 0x08081948 in
 _D3std3net4curl30__T3getTS3std3net4curl4HTTPTaZ3getFAxaS3std3net4curl4HTTPZAa
 (conn=..., url=10576998119117946914)
     at /usr/include/dmd/phobos/std/net/curl.d:364
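
The try/catch insulation suggested above could look roughly like this. It is
a sketch under stated assumptions, not the actual Phobos patch: headerCallback
and its string[string] userdata are made-up names, the signature only mirrors
the size/nmemb shape libcurl uses for header callbacks, and std.utf.validate
is called explicitly so that malformed input is rejected before matching.

```d
import std.regex : matchFirst, regex;
import std.utf : validate;

// Hypothetical callback with a C-compatible signature. libcurl treats a
// return value that differs from the number of bytes passed in as an error.
extern (C) size_t headerCallback(const(char)* data, size_t size, size_t nmemb,
                                 void* userdata) nothrow
{
    immutable len = size * nmemb;
    try
    {
        const(char)[] line = data[0 .. len];
        validate(line); // throws UTFException on malformed UTF-8

        auto m = matchFirst(line, regex(`(.*?): (.*)$`));
        if (!m.empty)
        {
            // Assumption: userdata points at a string[string] owned by the caller.
            auto headers = cast(string[string]*) userdata;
            (*headers)[m[1].idup] = m[2].idup;
        }
        return len;
    }
    catch (Exception e)
    {
        // Never let a D exception unwind through C frames; report failure instead.
        return 0;
    }
}

void main()
{
    string[string] headers;

    string good = "Content-Type: text/html";
    assert(headerCallback(good.ptr, 1, good.length, &headers) == good.length);
    assert(headers["Content-Type"] == "text/html");

    // Bytes from the report; 0x97 makes the line invalid UTF-8.
    ubyte[] bad = [0x64, 0x61, 0x97, 0x48, 0x34, 0x53, 0x54, 0x65, 0x46];
    assert(headerCallback(cast(const(char)*) bad.ptr, 1, bad.length, &headers) == 0);
}
```

Whether a bad header should abort the transfer (as returning 0 does here) or
simply be skipped is a design choice; a real fix might also store the caught
exception and rethrow it once curl_easy_perform has returned to D code.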
Oct 25 2013