digitalmars.D.learn - Reading .txt File into String and Matching with RegEx

BoQsc <vaidas.boqsc gmail.com> writes:
This is something I've searched for on the forum and couldn't find an 
exact answer to.

TLDR: `r"^."` is matching the very first two characters in the 
`input` string.

**matchtest.d**
```
import std.stdio : writeln;
import std.regex : matchAll;
import std.file  : read;

void main(){

	string input = cast(string)read("example.txt");
	writeln(matchAll(input, r"^."));

}
```

**Input(example.txt)**
```
HelloWorld
```

**Output**
```
rdmd matchtest.d
[["He"]]
```

https://dlang.org/phobos/std_regex.html#matchAll
https://dlang.org/library/std/file/read.html
Dec 10 2023
thinkunix <thinkunix zoho.com> writes:
BoQsc via Digitalmars-d-learn wrote:
> This is something I've searched for on the forum and couldn't find an
> exact answer to.
>
> TLDR: `r"^."` is matching the very first two characters in the `input`
> string.

Don't you need two dots to match two characters? Each dot matches a single character, so you need `r"^.."` instead of `r"^."` to get the first two characters. When I run your program (on Linux with rdmd from DMD 2.106.0), I get `[["H"]]`.
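
To make the one-dot versus two-dot point concrete, here is a minimal sketch that runs both patterns against an in-memory copy of the input, so no file is involved. The helper name `firstHit` is made up for this illustration, and `matchFirst` is used instead of `matchAll` only because a `^`-anchored pattern without the multiline flag can match at most once.

```d
import std.regex : matchFirst;
import std.stdio : writeln;

// Hypothetical helper: returns the text matched by `pattern` in `input`,
// or an empty string when there is no match.
string firstHit(string input, string pattern)
{
	auto m = matchFirst(input, pattern);
	return m.empty ? "" : m.hit;
}

void main()
{
	string input = "HelloWorld";
	writeln(firstHit(input, r"^."));   // one dot, one character: H
	writeln(firstHit(input, r"^.."));  // two dots, two characters: He
}
```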
Dec 10 2023
BoQsc <vaidas.boqsc gmail.com> writes:
On Monday, 11 December 2023 at 05:18:45 UTC, thinkunix wrote:
> BoQsc via Digitalmars-d-learn wrote:
>> This is something I've searched for on the forum and couldn't find an
>> exact answer to.
>>
>> TLDR: `r"^."` is matching the very first two characters in the
>> `input` string.
>
> Don't you need two dots to match two characters? Each dot matches a single character, so you need `r"^.."` instead of `r"^."` to get the first two characters. When I run your program (on Linux with rdmd from DMD 2.106.0), I get `[["H"]]`.

Yeah, that's true, my mistake: I forgot to update the snippet and the note properly. Thanks!

```
import std.stdio : writeln;
import std.regex : matchAll;
import std.file  : read;

void main(){

	string input = cast(string)read("example.txt");
	writeln(matchAll(input, r"^.."));

}
```
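
On the file-reading half of the question, a possible alternative to `cast(string)read(...)` is `std.file.readText`, which returns a `string` directly and validates that the contents are well-formed UTF. The sketch below is only an illustration under that assumption: the helper `firstTwoChars` and the file name `example_sketch.txt` are invented here, and the program writes and removes its own sample file so it does not touch the real `example.txt`.

```d
import std.file : readText, remove, write;
import std.regex : matchFirst;
import std.stdio : writeln;

// Hypothetical helper: reads a text file and returns its first two
// characters as matched by r"^..", or an empty string if nothing matches.
string firstTwoChars(string path)
{
	string input = readText(path);   // no cast needed, UTF is validated
	auto m = matchFirst(input, r"^..");
	return m.empty ? "" : m.hit;
}

void main()
{
	// Create a throwaway sample file so the sketch is self-contained.
	write("example_sketch.txt", "HelloWorld\n");
	scope(exit) remove("example_sketch.txt");

	writeln(firstTwoChars("example_sketch.txt")); // He
}
```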
Dec 11 2023
BoQsc <vaidas.boqsc gmail.com> writes:
The following program matches function declarations in a `.d` source 
file and captures the function names.


**regexcapture.d**

```
import std.stdio : writeln;
import std.regex : matchAll, regex;
import std.file  : read;

void main(){
	string input = cast(string)read("sourcecode.d");

	foreach(match; matchAll(input, regex(r"\b([A-Za-z_]\w*)\s*\([^)]*\)\s*", "g"))){
		writeln(match.captures()[1]);
	}
}
```

**Input(sourcecode.d)**
```
     BOOL WaitNamedPipeA(LPCSTR, DWORD);
     BOOL WaitNamedPipeW(LPCWSTR, DWORD);
     BOOL WinLoadTrustProvider(GUID*);
     BOOL WriteFile(HANDLE, PCVOID, DWORD, PDWORD, LPOVERLAPPED);
     BOOL WriteFileEx(HANDLE, PCVOID, DWORD, LPOVERLAPPED, LPOVERLAPPED_COMPLETION_ROUTINE);
     BOOL WritePrivateProfileSectionA(LPCSTR, LPCSTR, LPCSTR);
     BOOL WritePrivateProfileSectionW(LPCWSTR, LPCWSTR, LPCWSTR);
     BOOL WritePrivateProfileStringA(LPCSTR, LPCSTR, LPCSTR, LPCSTR);
     BOOL WritePrivateProfileStringW(LPCWSTR, LPCWSTR, LPCWSTR, LPCWSTR);
```
Note: This small input excerpt was taken from a real source code 
file: 
https://github.com/dlang/dmd/blob/master/druntime/src/core/sys/windows/winbase.d#L2069-L2078

**Output**
```

C:\Users\Windows10\Documents\matchtest>rdmd regexcapture.d
WaitNamedPipeA
WaitNamedPipeW
WinLoadTrustProvider
WriteFile
WriteFileEx
WritePrivateProfileSectionA
WritePrivateProfileSectionW
WritePrivateProfileStringA
WritePrivateProfileStringW
```
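
For anyone who wants to try the capture without a `sourcecode.d` file on disk, here is a rough sketch that applies the same pattern to an in-memory string and collects capture group 1 into an array. The helper name `functionNames` is hypothetical. The `"g"` flag is left out on the assumption, based on my reading of the `std.regex` documentation, that `matchAll` already iterates over every match. Note that the pattern only looks for an identifier followed by a parenthesised list, so it would also pick up ordinary call expressions, not just declarations.

```d
import std.regex : matchAll, regex;
import std.stdio : writeln;

// Hypothetical helper: returns capture group 1 (the identifier in front of
// the parenthesised parameter list) for every match in `source`.
string[] functionNames(string source)
{
	auto re = regex(r"\b([A-Za-z_]\w*)\s*\([^)]*\)\s*");
	string[] names;
	foreach (m; matchAll(source, re))
		names ~= m[1];   // m[0] is the whole match, m[1] the first group
	return names;
}

void main()
{
	string source = "BOOL WinLoadTrustProvider(GUID*);\n"
	              ~ "BOOL WriteFile(HANDLE, PCVOID, DWORD, PDWORD, LPOVERLAPPED);\n";
	writeln(functionNames(source)); // ["WinLoadTrustProvider", "WriteFile"]
}
```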

---
Relevant links:
https://dlang.org/phobos/std_regex.html#regex
https://dlang.org/phobos/std_regex.html#.RegexMatch.captures
https://regexr.com/
Dec 11 2023