digitalmars.D.learn - Bug in std.regexp with capturing subpatterns?

I don't know if this is really a bug, so I'm posting here in the learn newsgroup.
Consider the following regular expression: "^((\\w)#)?(\\w+)$"
If the string being searched contains no '#', the whole first subpattern
doesn't match, so it captures nothing (an empty string). The nested
subpattern (\\w) shouldn't capture anything either, right?

I tested it with the following code:
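import std.stdio;
import std.regexp;

int main(char[][] args)
{
    // Search "test" with the pattern; it contains no '#', so group 1 cannot match
    auto m = std.regexp.search("test", "^((\\w)#)?(\\w+)$");
    if (m is null) { writefln("Doesn't match"); return 0; }
    // Print the whole match (0) and each captured subpattern
    for (int i = 0; i < 6; i++)
        writefln("%d: %s", i, m.match(i));
    return 0;
}

Which displays: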
0: test
2: t
3: test

As you can see, subpattern #2 captures the first letter of the string.
I don't really know the regexp implementation, but I assume it doesn't
throw away previously captured subpatterns after it encounters the
mismatch at '#'.
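
A narrower check of the same thing, as a minimal sketch: it reuses only the std.regexp.search and match(n) calls from the test above and flags the case where group 1 is empty but group 2, which is nested inside it, is not. The message texts are only illustrative.

```d
import std.stdio;
import std.regexp;

int main(char[][] args)
{
    auto m = std.regexp.search("test", "^((\\w)#)?(\\w+)$");
    if (m is null) { writefln("Doesn't match"); return 0; }

    // Group 2 is nested inside group 1, so it should be empty
    // whenever group 1 did not take part in the match.
    if (m.match(1).length == 0 && m.match(2).length != 0)
        writefln("inconsistent: group 1 is empty, but group 2 captured '%s'", m.match(2));
    else
        writefln("captures look consistent");
    return 0;
}
```

If the stale capture really is the problem, the first branch should be the one that fires for any input without a '#'.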
Feb 25 2007