digitalmars.D.learn - failing regex

yawniek <dlang srtnwz.com> writes:
This regex from
https://github.com/ua-parser/uap-core/blob/master/regexes.yaml#L38
seems to work in other languages, but not in D:

auto r2 = r"(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 _\!\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*)) (\d+)(?:\.(\d+)(?:\.(\d+))?)?".regex();

( https://gist.github.com/4334e35e68497c0517db )

results in

```
dmd -run failing_regex.d
std.regex.internal.ir.RegexException /usr/local/Cellar/dmd/2.069.0/include/d2/std/regex/internal/parser.d(1392): invalid escape sequence
Pattern with error: `(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 _\!` <--HERE-- `\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*)) (\d+)(?:\.(\d+)(?:\.(\d+))?)?`
----------------
4   dmd_run68HuB5                       0x00000001044d211d @trusted void std.regex.internal.parser.Parser!(immutable(char)[]).Parser.error(immutable(char)[]) + 297
5   dmd_run68HuB5                       0x00000001044da604 ref @trusted std.regex.internal.parser.Parser!(immutable(char)[]).Parser std.regex.internal.parser.Parser!(immutable(char)[]).Parser.__ctor!(const(char)[]).__ctor(immutable(char)[], const(char)[]) + 160
6   dmd_run68HuB5                       0x00000001044cc732 @safe std.regex.internal.ir.Regex!(char).Regex std.regex.regexImpl!(immutable(char)[]).regexImpl(immutable(char)[], const(char)[]) + 86
7   dmd_run68HuB5                       0x00000001044e944f std.regex.internal.ir.Regex!(char).Regex std.functional.__T7memoizeS95_D3std5regex18__T9regexImplTAyaZ9regexImplFNfAyaAxaZS3std5regex8internal2ir12__T5RegexTaZ5RegexVii8Z.memoize(immutable(char)[], const(char)[]) + 475
8   dmd_run68HuB5                       0x00000001044cc6bc @trusted std.regex.internal.ir.Regex!(char).Regex std.regex.regex!(immutable(char)[]).regex(immutable(char)[], const(char)[]) + 64
9   dmd_run68HuB5                       0x00000001044cc5de _Dmain + 46
10  dmd_run68HuB5                       0x0000000104509ac3 D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv + 39
11  dmd_run68HuB5                       0x00000001045099fb void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 55
12  dmd_run68HuB5                       0x0000000104509a68 void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).runAll() + 44
13  dmd_run68HuB5                       0x00000001045099fb void rt.dmain2._d_run_main(int, char**, extern (C) int function(char[][])*).tryExec(scope void delegate()) + 55
14  dmd_run68HuB5                       0x000000010450994d _d_run_main + 497
15  dmd_run68HuB5                       0x00000001044cc677 main + 15
16  libdyld.dylib                       0x00007fff8e5185c8 start + 0
17  ???                                 0x0000000000000000 0x0 + 0

```


Is this a bug, or did I do something wrong?
Nov 23 2015
Rikki Cattermole <alphaglosined gmail.com> writes:
On 23/11/15 9:30 PM, yawniek wrote:
 bug or did i do something wrong?
It's the \! sequence: don't escape the !. That is the escape the parser rejects as an "invalid escape sequence".
Nov 23 2015
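
For illustration, here is a minimal sketch of the fix suggested above: the same pattern with `\!` replaced by a plain `!` inside the character class. The helper name `parseBot` and the sample input are hypothetical and not taken from the thread; `regex` and `matchFirst` are the usual std.regex entry points.

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: returns [name, major, minor, patch] on a match,
/// or an empty array when the pattern does not match.
string[] parseBot(string userAgent)
{
    // Pattern from the question; the only change is `\!` -> `!`.
    auto re = regex(r"(?:\/[A-Za-z0-9\.]+)? *([A-Za-z0-9 _!\[\]:]*(?:[Aa]rchiver|[Ii]ndexer|[Ss]craper|[Bb]ot|[Ss]pider|[Cc]rawl[a-z]*)) (\d+)(?:\.(\d+)(?:\.(\d+))?)?");

    auto m = matchFirst(userAgent, re);
    if (m.empty)
        return null;

    // Groups that did not participate in the match come back as empty strings.
    return [m[1], m[2], m[3], m[4]];
}

void main()
{
    // Made-up input, chosen only to exercise all four capture groups.
    writeln(parseBot("Examplebot 1.2.3"));
}
```

Inside a character class `!` has no special meaning, so dropping the backslash should not change what the pattern matches.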