digitalmars.D - compile-time regexp lib released

reply Marton Papp <anteusz freemail.hu> writes:
Hi!
There is an alternative way to match regular expressions.
I released scregexp a month ago.
The regular expressions are converted into CTFE functions (a backtracking,
top-down recursive descent parser) at compile time.
This should make matching faster than interpreting the pattern at run time.
Many regular expression constructs are supported:
(?:), (), \d, \s, \w, and so on, plus various switches such as i
(case-insensitive search), s, and, partly, x.
I consider it to be in beta state.

Regards

Marton Papp
Aug 07 2007
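To make the idea of turning a pattern into matching functions at compile time concrete, here is a minimal sketch. It is not scregexp's code: it uses templates and `static if` rather than CTFE-generated source, and it only handles literal characters, `\d`, `\w`, and a postfix `+`. The names `matchAtom` and `matchHere` are made up for this illustration.

```d
// Sketch only: each distinct pattern instantiates its own specialised,
// backtracking matcher at compile time. Not scregexp's implementation.

// Does the single pattern atom (for example `a`, `\d`, `\w`) accept c?
bool matchAtom(string atom)(char c)
{
    static if (atom == `\d`)
        return c >= '0' && c <= '9';
    else static if (atom == `\w`)
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9') || c == '_';
    else static if (atom.length == 2)
        return c == atom[1];            // any other escape: literal character
    else
        return c == atom[0];
}

// True if pattern p matches a prefix of s (anchored at the start of s).
bool matchHere(string p)(string s)
{
    static if (p.length == 0)
    {
        return true;                    // nothing left to match
    }
    else
    {
        enum n = (p[0] == '\\') ? 2 : 1;    // an escape is a two-character atom
        enum atom = p[0 .. n];
        static if (p.length > n && p[n] == '+')
        {
            enum rest = p[n + 1 .. $];
            // Greedy: take as many characters as possible...
            size_t i = 0;
            while (i < s.length && matchAtom!atom(s[i]))
                ++i;
            // ...then backtrack one character at a time until the rest fits.
            for (; i >= 1; --i)
                if (matchHere!rest(s[i .. $]))
                    return true;
            return false;
        }
        else
        {
            enum rest = p[n .. $];
            return s.length > 0 && matchAtom!atom(s[0])
                && matchHere!rest(s[1 .. $]);
        }
    }
}
```

For example, `matchHere!(`\d+_\w+`)("42_answer")` should return true, and `a+ab` against "aaab" only succeeds because the `+` loop gives one character back. Since the functions are plain D with no side effects, they can also be evaluated inside a `static assert`.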
next sibling parent Marton Papp <anteusz freemail.hu> writes:
The library is called scregexp and can be downloaded from dsource.
It is based on Don Clugston's initial version, regexp2.d.
Aug 07 2007
prev sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Marton Papp Wrote:

 Hi!
 There is an alternative way to match regular expressions.
 Scregexp was released a month ago. (by me)
 The regular expressions are converted into CTFE functions (a backtracking
 top/down recursive descent parser) at compile time.
 This should give faster regular expressions.
 A lot of regular expression constructs are supported.
 (?:),(),\d,\s,\w... Also, various switches such as i (case-insensitive
 search),s, and x partly..
 I consider it to be in beta state.
 
 Regards
 
 Marton Papp
 
 
Nice work! I've always wanted to see compile-time regexes as string mixins, so that they could set Perl-like variables: __1 would be the first match, __2 the second, etc. Mind if I throw together a modification of your system to do that?
Aug 07 2007
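A rough sketch of the string-mixin idea follows. It is not built on scregexp; Phobos's `std.regex.ctRegex` stands in as the compile-time regex engine. The helper `bindCaptures` and the names `_m`, `_1`, `_2` are hypothetical. Single-underscore names are used because D reserves identifiers that begin with two underscores. The sketch also assumes the pattern contains no backtick, since it is pasted into a backtick-quoted literal.

```d
import std.conv : to;
import std.regex : ctRegex, matchFirst;

// Hypothetical helper, evaluated at compile time (CTFE): returns D source
// that runs the match and declares one variable per capture group.
string bindCaptures(string pattern, string inputExpr, size_t count)
{
    string code = "auto _m = matchFirst(" ~ inputExpr
        ~ ", ctRegex!(`" ~ pattern ~ "`));\n";
    foreach (i; 1 .. count + 1)
    {
        immutable n = to!string(i);
        // Each capture is null when the pattern did not match.
        code ~= "auto _" ~ n ~ " = _m.empty ? null : _m[" ~ n ~ "];\n";
    }
    return code;
}

// Usage: the mixin injects _m, _1 and _2 into this function's scope.
string[] demoCaptures(string str)
{
    mixin(bindCaptures(`(\d+)_(\w+)`, "str", 2));
    return [_1, _2];
}
```

The caller still has to say how many groups to bind. A fuller version would count the groups in the pattern at compile time.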
parent reply Marton Papp <anteusz freemail.hu> writes:
Hi!

I do not mind, except that I do not understand your choice of __1, __2...
Why not _1, _2, _3?
Or dollar1, dollar2, dollar3?
And would you fork it, or would you like it to be added to scregexp?

Regards

Marton Papp
Aug 07 2007
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Marton Papp Wrote:

 Hi!
 
 I do not mind. Except I do not understand your choice of __1, __2..
 why not _1,_2,_3?
 Or dollar1,dollar2,dollar3?
 and would you fork it or would you like it to be added to scregexp?
 
 Regards
 
 Marton Papp
I was just going to extend it for my project when I eventually get around to it, probably not for another few months (I want to see Descent become as strong an IDE as JDT before I start splitting my attention). If you think it might be useful for scregexp, I don't think it'd be too hard to add.

Thinking a bit more about it, an array might be better for that: _[0] would be the first capture, _[1] the second, etc. But that still doesn't allow something awesome like:

    if(str =~ m/(\d+)_(\w+)/)
    {
        print "$1 = $2";
    }

... or how I'd like to see it in D:

    if(str.matches(`(\d+)_(\w+)`))
    {
        writefln("%s = %s", _[0], _[1]);
    }

... where matches() compiles the regex at compile time like scregexp. Oh well, maybe AST macros will open up that possibility, though I'm not sure exactly how that'd work.
Aug 07 2007
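Something close to the wished-for `if(str.matches(...))` form can be approximated in present-day D by declaring the capture object in the `if` condition. The sketch below again uses Phobos's `std.regex.ctRegex` rather than scregexp. `matches` and `describe` are hypothetical names. Unlike the proposal above, the variable `_` has to be declared explicitly, and index 0 holds the whole match, so the groups start at 1.

```d
import std.format : format;
import std.regex : ctRegex, matchFirst;

// Hypothetical helper: the pattern is a template argument, so the regex
// is compiled at compile time by ctRegex.
auto matches(string pattern)(string input)
{
    return matchFirst(input, ctRegex!pattern);
}

string describe(string str)
{
    // The captures object converts to bool: true when the match succeeded.
    if (auto _ = str.matches!(`(\d+)_(\w+)`))
        return format("%s = %s", _[1], _[2]);
    return "no match";
}
```

With this, `describe("42_answer")` yields "42 = answer". Passing the pattern as a template argument is what allows the work to happen at compile time; a run-time string argument could not be handed to `ctRegex`.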
parent Marton Papp <anteusz freemail.hu> writes:
== Quote from Robert Fraser (fraserofthenight gmail.com)'s article
 Marton Papp Wrote:
 Hi!

 I do not mind. Except I do not understand your choice of __1, __2..
 why not _1,_2,_3?
 Or dollar1,dollar2,dollar3?
 and would you fork it or would you like it to be added to scregexp?

 Regards

 Marton Papp
I was just going to extend it for my project when I eventually get around to it,
probably not for another few months (I want to see Descent get as strong an IDE as JDT before I start splitting my attention). If you think it might be useful for scregexp, I don't think it'd be too hard to add.
 Thinking a bit more about it, an array might be better for that. _[0] would be
the first capture, _[1] the second, etc. But that still doesn't allow something awesome like:
 if(str =~ m/(\d+)_(\w+)/)
 {
     print "$1 = $2";
 }
 ... or how I'd like to see it in D:
 if(str.matches(`(\d+)_(\w+)`)
 {
     writefln("%d = %d", _[0], _[1]);
 }
 ... where matches() compiles the regex at compile time like scregex.
 Oh well, maybe AST macros will open up that possibility, though I'm not sure exactly how that'd work.

Have a look at how it is now!

import scregexp;
auto groups = indexgroups!(`(\d+)_(\w+)`)(str);
if (groups !is null)
{
    writefln("%s = %s", group(str,groups,0), group(str,groups,1));
}

I cannot see how this can be done in D at all:

if(str.matches(`(\d+)_(\w+)`))
{
    writefln("%s = %s", _[0], _[1]);
}

Marton Papp
Aug 08 2007
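The `indexgroups` / `group` pair above suggests that captures are stored as positions in the input rather than as copied strings. The thread does not show how scregexp actually lays this out, so the following is only a guess at the general technique: a flat array of start/end offsets, two entries per capture. The name `captureSlice` and the layout are assumptions for illustration, not scregexp's API.

```d
// Assumed layout (not confirmed by the thread): offsets[2 * n] is the start
// and offsets[2 * n + 1] the end (exclusive) of capture number n.
string captureSlice(string input, const(size_t)[] offsets, size_t n)
{
    // Slicing does not copy; the result refers to the original input.
    return input[offsets[2 * n] .. offsets[2 * n + 1]];
}
```

For the input "42_answer" and offsets [0, 2, 3, 9], capture 0 is "42" and capture 1 is "answer". Storing offsets keeps matching allocation-free, and a null array can signal "no match", as in the `groups !is null` test above.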