digitalmars.D.bugs - [Issue 2108] New: regexp.d: The greedy dotstar isn't so greedy

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108

           Summary: regexp.d: The greedy dotstar isn't so greedy
           Product: D
           Version: 2.012
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: nyphbl8d gmail.com


As far as I'm aware, ".*" should be greedy by default and become non-greedy
when changed to ".*?".  As it stands now, both ".*" and ".*?" are non-greedy
when it comes to std.regexp and I have found no way to make ".*" greedy, flags
or otherwise.  This can be seen by using
"<packet>text</packet><packet>text</packet>" as the buffer to match against and
"<packet.*/packet>" as the pattern.  When I use this with std.regexp.search, it
only matches the first opening and closing tag instead of the outer set.  I
just hope this isn't my lack of regex-fu coming back to haunt me.
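
For reference, a minimal sketch of the behavior the report expects. It is written against std.regex rather than std.regexp, and assumes a Phobos recent enough to provide matchFirst and to include the fix referenced at the end of this thread. The helper firstHit is hypothetical and not part of Phobos.

```d
import std.regex;

// Hypothetical helper (not part of Phobos): returns the first match of
// `pattern` in `input`, or null when nothing matches.
string firstHit(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    auto buffer = "<packet>text</packet><packet>text</packet>";

    // Greedy: .* runs to the last "/packet>", so the whole buffer matches.
    assert(firstHit(buffer, "<packet.*/packet>") == buffer);

    // Lazy: .*? stops at the first "/packet>", so only the first element matches.
    assert(firstHit(buffer, "<packet.*?/packet>") == "<packet>text</packet>");
}
```

With a greedy ".*" the match spans the outer pair of tags; only the ".*?" form should stop at the first closing tag.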


-- 
May 14 2008
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


Andrei Alexandrescu <andrei metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |andrei metalanguage.com
         AssignedTo|nobody puremagic.com        |andrei metalanguage.com


-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Oct 11 2009
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


David Simcha <dsimcha yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |swadenator gmail.com


--- Comment #1 from David Simcha <dsimcha yahoo.com> 2009-10-18 07:44:51 PDT ---
*** Issue 2487 has been marked as a duplicate of this issue. ***

-- 
Oct 18 2009
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


Jesse Phillips <Jesse.K.Phillips+D gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Jesse.K.Phillips+D gmail.com
         OS/Version|Linux                       |All


--- Comment #2 from Jesse Phillips <Jesse.K.Phillips+D gmail.com> 2010-05-06
14:30:40 PDT ---
This is also an issue in Windows with std.regex using DMD 2.043

But I would like to add that it is always greedy prior to text. The first
assert will fail since it was not non-greedy and the second is what it should
be.

import std.regex;

void main() {
   assert(match("Hello there you silly person you.",
     regex(r"\b.+? you .+\w")).hit != "Hello there you silly");

   assert(match("Hello there you silly person you.",
     regex(r"\b.+? you .+\w")).hit == "there you silly person");
}

-- 
May 06 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com


--- Comment #3 from Dmitry Olshansky <dmitry.olsh gmail.com> 2011-04-18
13:54:37 PDT ---
(In reply to comment #2)
> This is also an issue in Windows with std.regex using DMD 2.043
>
> But I would like to add that it is always greedy prior to text. The first
> assert will fail since it was not non-greedy and the second is what it should
> be.
>
> import std.regex;
>
> void main() {
>    assert(match("Hello there you silly person you.",
>      regex(r"\b.+? you .+\w")).hit != "Hello there you silly");
>
>    assert(match("Hello there you silly person you.",
>      regex(r"\b.+? you .+\w")).hit == "there you silly person");
> }

Actually it should be

   assert(match("Hello there you silly person you.",
     regex(r"\b.+? you .+\w")).hit == "Hello there you silly person you");

Two points: \b also matches at the beginning of input (if the first char is
\w), and the last .+ is greedy. Since the trailing '.' is certainly not a \w,
the match ends just before it.

Also tested at:
http://www.regextester.com/
http://www.regular-expressions.info/javascriptexample.html
... etc.

P.S. The patch is coming ;)
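
The two points above can be checked with a short sketch. It assumes a Phobos recent enough to provide matchFirst and a std.regex that includes the fix; the helper firstHit is hypothetical and not part of Phobos.

```d
import std.regex;

// Hypothetical helper (not part of Phobos): returns the first match of
// `pattern` in `input`, or null when nothing matches.
string firstHit(string input, string pattern)
{
    auto m = matchFirst(input, regex(pattern));
    return m.empty ? null : m.hit;
}

void main()
{
    auto text = "Hello there you silly person you.";

    // \b matches at offset 0 because 'H' is a word character, and the lazy
    // .+? extends only as far as the first " you".
    assert(firstHit(text, r"\b.+? you") == "Hello there you");

    // The final .+ is greedy and \w must match last, so the hit runs to the
    // last word character and leaves out the closing '.'.
    assert(firstHit(text, r"\b.+? you .+\w") == "Hello there you silly person you");
}
```

The lazy quantifier only shortens the part before the first " you"; it does not move the start of the match away from the leftmost position where the pattern can succeed.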
Apr 18 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


Andrei Alexandrescu <andrei metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|andrei metalanguage.com     |dmitry.olsh gmail.com


--- Comment #4 from Andrei Alexandrescu <andrei metalanguage.com> 2011-06-04
17:48:52 PDT ---
Reassigning to Dmitry.

-- 
Jun 04 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108



--- Comment #5 from Dmitry Olshansky <dmitry.olsh gmail.com> 2011-06-05
00:09:28 PDT ---
I'd gladly close this issue, since it now works correctly in std.regex. But the
report is filed against std.regexP. Should I close it?

-- 
Jun 05 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108



--- Comment #6 from Andrei Alexandrescu <andrei metalanguage.com> 2011-06-05
06:18:54 PDT ---
Yes. Please also update the changelog.dd file.

-- 
Jun 05 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=2108


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


--- Comment #7 from Dmitry Olshansky <dmitry.olsh gmail.com> 2011-06-06
08:02:43 PDT ---
Fixed for std.regex:
https://github.com/D-Programming-Language/phobos/commit/9afb00e36b625322d7f1d8ec0fbd876c2b5c03fc

-- 
Jun 06 2011