digitalmars.D.learn - regex on binary data

"Darrell" <dgallion1 gmail.com> writes:
So far, attempts to run regex on binary data cause
"Invalid UTF-8 sequence".

Attempts to pass ubyte also didn't work out.
Dec 31 2014
"Tobias Pankrath" <tobias pankrath.net> writes:
On Wednesday, 31 December 2014 at 15:36:19 UTC, Darrell wrote:
> So far, attempts to run regex on binary data cause
> "Invalid UTF-8 sequence".
>
> Attempts to pass ubyte also didn't work out.

I doubt that using anything other than string, wstring, or dstring is supported, or even possible.
Dec 31 2014
ketmar via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> writes:
On Wed, 31 Dec 2014 15:36:16 +0000
Darrell via Digitalmars-d-learn <digitalmars-d-learn puremagic.com>
wrote:

> So far, attempts to run regex on binary data cause
> "Invalid UTF-8 sequence".
>
> Attempts to pass ubyte also didn't work out.

The current regex engine assumes that you are using UTF-8 encoded text. I really want the regex engine to support user-supplied input ranges instead, so that decoding can be done by the range (and the regex engine can work on anything, not only on strings), but I'm not ready for that challenge yet. Maybe I'll try to do something with it in 2015. ;-)
Dec 31 2014
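
Since the engine only accepts string, wstring, or dstring input, one possible workaround (not something proposed in the thread) is to widen every byte to a single code point in the range U+0000 to U+00FF and match against the resulting dstring. The sketch below assumes that approach; bytesToDstring is a hypothetical helper name, and the sample bytes and pattern are made up for illustration. Because each byte becomes exactly one dchar, offsets into the dstring are byte offsets into the original data.

```d
import std.algorithm : map;
import std.array : array;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): widen each byte to one
/// code point in U+0000..U+00FF, so the result is always valid UTF-32.
dstring bytesToDstring(const(ubyte)[] data)
{
    return data.map!(b => cast(dchar) b).array.idup;
}

void main()
{
    // Made-up sample data: two filler bytes followed by FF D8 FF E0.
    ubyte[] data = [0x00, 0x10, 0xFF, 0xD8, 0xFF, 0xE0];
    dstring text = bytesToDstring(data);

    // The pattern is a dstring too; \u00FF stands for the byte 0xFF.
    auto re = regex("\u00FF\u00D8\u00FF[\u00E0-\u00EF]"d);
    auto m = matchFirst(text, re);

    if (!m.empty)
    {
        // One dchar per byte, so the length of the text before the
        // match is the byte offset of the match in `data`.
        writeln("match at byte offset ", m.pre.length);
    }
}
```

This costs four bytes of memory per input byte, and classes such as \w or \d keep their Unicode meaning, so they will also match some of the widened values above 0x7F. Spelling out explicit ranges, as in the pattern above, avoids that surprise.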