digitalmars.D.learn - std.regex named matches

James Miller <james aatch.net> writes:
Hi,

I am using std.regex and using the named matches. I would like to be
able to get at the names that have matched, since this is library
code.

e.g.

    auto m = match("test/2", regex(r"(?P<word>\w+)/(?P<num>\d)"));
    //either
    auto names = m.names;
    //or
    auto names = m.captures.names;

or something similar. I've looked at the library and I can't find
anything of the sort, and you can't even use `foreach` to get at them
that way, I'm guessing because you can have both integer and string
indexes for the matches.

Thanks

James Miller
Feb 08 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
I know this is two weeks old, but you can do foreach on them:

    foreach (c; m.captures) {
        // c is each captured group in turn; the first one is the whole match
    }

Feb 20 2012
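
The foreach suggestion above can be made runnable roughly as follows. This is only a sketch: it assumes a Phobos release that provides matchFirst (the thread itself uses match), and captureList is a hypothetical helper name, not part of std.regex.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper (not part of std.regex): collects every capture of
// the first match, with the whole match as element 0.
// Returns an empty array when the pattern does not match.
string[] captureList(string input, string pattern)
{
    string[] result;
    foreach (c; matchFirst(input, regex(pattern)))
        result ~= c;
    return result;
}

void main()
{
    // Prints the whole match first, then each group in pattern order.
    foreach (c; captureList("test/2", r"(?P<word>\w+)/(?P<num>\d)"))
        writeln(c);
}
```

Iterating this way yields only the matched text, not the group names.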
James Miller <james aatch.net> writes:
Yeah, the problem with that is that I want the /names/ of the matches;
that only returns the match. This was for library code, so the
developer passes the regex. There are workarounds, but I would have
liked to be able to do something more like

    auto names = m.captures.names;
    foreach (name; names) {
        writeln(name, ": ", m.captures[name]);
    }

or even a straight AA-style foreach like this:

    foreach (name, match; m.captures) {
        writeln(name, ": ", match);
    }

It was decided that more data was needed in regex anyway, but there
was no consensus as to how that should be implemented.

Feb 20 2012
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
Names work as aliases for numbers, so not every captured group has a
name. But how about something along the lines of:

    foreach (num, match; m.captures)
        writeln(m.nameOf(num), ": ", match);

where nameOf(x) would return, hmm, an empty string for groups with no
name?

As for more data being needed in regex: yes, more thought is needed.
And being the guy behind the current implementation, I'm curious what
your use case is and how best to fit it into the general API.

--
Dmitry Olshansky
Feb 21 2012
James Miller <james aatch.net> writes:
Ah, right, nameOf would work OK. I know the issue stems from there not
really being a "best solution" in this case, because you can have named
/and/ unnamed matches. I don't /require/ it; it would just make my life
easier.

My use case, expanding on what I said before, is that I want to be able
to present debugging information, and being able to enumerate the names
in the match goes a long way toward that. It also means I can do more
flexible dispatching of data; otherwise I need hardcoded rules for the
names, which is unfortunate.

Like I said, there is nothing wrong with what is there; I just think it
can be improved. nameOf would work well: if it returns null or an empty
string, then there is no name for that match. Since you can already
grab specific names out of the regex anyway, this just fills in the gap
in the other direction, going from match to name rather than name to
match.

Thanks

--
James Miller
Feb 21 2012
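
The name enumeration asked for in this thread could look roughly like the sketch below. It rests on assumptions that go beyond the thread: a later Phobos release in which the compiled regex exposes a namedCaptures range and matchFirst is available. namedMatches is a hypothetical helper name, and the nameOf function discussed above is not used because the thread does not establish that it exists.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper (not part of std.regex): maps each group name in
// `pattern` to the text it matched in `input`.
// Assumes Regex.namedCaptures is available in the installed Phobos.
// Returns an empty table when the pattern does not match.
string[string] namedMatches(string input, string pattern)
{
    auto re = regex(pattern);
    auto caps = matchFirst(input, re);

    string[string] result;
    if (caps.empty)
        return result;

    // Unnamed groups never appear here, since only names are enumerated.
    foreach (name; re.namedCaptures)
        result[name] = caps[name];
    return result;
}

void main()
{
    // Associative-array iteration order is unspecified.
    foreach (name, text; namedMatches("test/2", r"(?P<word>\w+)/(?P<num>\d)"))
        writeln(name, ": ", text);
}
```

This goes from name to match for every name the pattern declares, which covers the debugging and dispatching use case; it does not recover the numeric index of a named group.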