digitalmars.D.bugs - [Issue 9390] New: Option for verbose regular expressions

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9390

           Summary: Option for verbose regular expressions
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc


--- Comment #0 from bearophile_hugs eml.cc 2013-01-24 18:14:01 PST ---
I'd really like an option to write "verbose" regular expressions in D, like in
Python:

http://docs.python.org/2/library/re.html


 re.X
 re.VERBOSE
 
     This flag allows you to write regular expressions that look
     nicer. Whitespace within the pattern is ignored, except when in a
     character class or preceded by an unescaped backslash, and, when
     a line contains a '#' neither in a character class or preceded by
     an unescaped backslash, all characters from the leftmost such '#'
     through the end of the line are ignored.
 
     That means that the two following regular expression objects that
     match a decimal number are functionally equal:
 
     a = re.compile(r"""\d +  # the integral part
                        \.    # the decimal point
                        \d *  # some fractional digits""", re.X)
     b = re.compile(r"\d+\.\d*")
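
The equivalence claimed in the quoted docs is easy to check. A minimal sketch
using only Python's standard `re` module; the sample strings are made up for
illustration and are not part of the docs:

```python
import re

# Verbose form: whitespace and '#' comments inside the pattern are ignored.
a = re.compile(r"""\d +  # the integral part
                   \.    # the decimal point
                   \d *  # some fractional digits""", re.X)

# Compact form of the same pattern.
b = re.compile(r"\d+\.\d*")

# Illustrative inputs (an assumption, not taken from the docs).
samples = ["3.14", "42.", "abc", ".5", "7"]

# Record what each compiled pattern says about each sample.
results = [(s, bool(a.fullmatch(s)), bool(b.fullmatch(s))) for s in samples]
for s, ra, rb in results:
    print(f"{s!r}: verbose={ra} compact={rb}")
```

Both objects should report the same verdict for every input, since the flag
only changes how the pattern text is read, not what it matches.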

RE code is code like any other, so it benefits from comments and from nicer
indentation and formatting. Making REs more readable makes them easier to debug
and understand. In my Python code, every RE longer than half a line is
"verbose".

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 24 2013
parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9390


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com


--- Comment #1 from Dmitry Olshansky <dmitry.olsh gmail.com> 2013-01-25 12:13:45 PST ---
How about adding the common extension that allows comments inside a regular
expression?

I can't recall the syntax off-hand, but it's something like:
(?# some comment that is ignored)
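
For reference, Python's `re` already accepts this inline form, so it can serve
as a sketch of the behaviour being proposed. Whether std.regex would use exactly
the same spelling is an assumption here, not something this thread confirms:

```python
import re

# Inline comment groups (?#...) are dropped by the regex compiler,
# with no flag needed and no change to how whitespace is treated.
commented = re.compile(r"\d+(?#integral part)\.(?#decimal point)\d*(?#fraction)")
plain = re.compile(r"\d+\.\d*")

# Sample text is illustrative only.
text = "pi is roughly 3.14159"
m1 = commented.search(text)
m2 = plain.search(text)
print(m1.group(0), m2.group(0))
```

Unlike a verbose flag, this leaves literal spaces in the pattern significant,
which is why it does not touch the existing syntax rules.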


Plus you can already use any of the following:

auto pattern = r"the first piece" // comment
r"the second piece" // comment 2
...
r" the last piece"; // last comment


Or if implicit concatenation feels too dirty:

auto pattern = r"the first piece"  // comment
~ r"the second piece" // comment 2
...
~ r" the last piece"; // last comment
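
The same piecewise layout works in Python, the language the report borrows
from: adjacent string literals inside parentheses are joined at compile time, so
each piece can carry an ordinary comment. A sketch, with the pieces chosen to
mirror the decimal-number pattern quoted in comment #0:

```python
import re

# Adjacent literals inside the parentheses are concatenated implicitly;
# the comments are ordinary Python comments, invisible to the regex engine.
pattern = (
    r"\d+"   # the integral part
    r"\."    # the decimal point
    r"\d*"   # some fractional digits
)

# Explicit concatenation builds the same string.
explicit = r"\d+" + r"\." + r"\d*"

decimal = re.compile(pattern)
print(pattern == explicit, decimal.fullmatch("10.25") is not None)
```

Either spelling keeps the comments in the host language, so the regex syntax
itself needs no new option.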

Either way, a free-form regex plus a top-level explanatory note is enough by my
standards. The rationale: if you have to explain every piece in isolation, then
it's one of 2 cases: you are explaining the mechanics to people who don't know
what a regex is (and that's wrong), or the regex pattern is too darn complex for
its own good.

Since this is an enhancement request, I hereby propose 2 ways to solve it: close
it as won't fix, or add the aforementioned extension for comments (which at
least is more or less common). I'm not going to add another option that messes
with the syntax rules.

Jan 25 2013