digitalmars.D.learn - Re: Fuzzy string matching?

reply dsmith <ds nomail.com> writes:
Until recently, you could easily use std.regexp.search(target_string,
find_string), but regexp is apparently no longer in Phobos.  I seek a simple
substitute.  std.algorithm.canFind might work, since it returns bool.

Maybe try something like:

import std.algorithm : canFind;

foreach(ref str; strings)                  // ref, so the append changes the array element
    foreach(fls; system_files)
        if(canFind(fls, str))              // exact substring test only; usage needs verification
            str ~= ".ext";


== Repost the article of Jonathan M Davis (jmdavisProg gmx.com)
== Posted at 2011/07/15 22:03 to digitalmars.D.learn

On Saturday 16 July 2011 01:17:36 Andrej Mitrovic wrote:
 Is there any such method in Phobos?

 I have to rename some files based on a string array of known names
 which need to be fuzzy-matched to file names and then rename the files
 to the matches.

 E.g.:

 string[] strings = ["food", "lamborghini", "architecture"];

 files on system:
 .\foo.ext
 .\lmbrghinione.ext
 .\archtwo.ext

 and if there's a fuzzy match then the matched files would be renamed to:
 .\food.ext
 .\lamborghini.ext
 .\architecture.ext

 Perhaps there's a C library I can use for this?

You can pass a comparator function to cmp to change how comparison is done, but it's by character, so it'll only work in the case where the number of characters is identical. Other than that, I'd be tempted to say that there must be a function in std.range or std.algorithm that you could get to do it, but I'd have to go over the list and really think about it. The fact that you're effectively comparing the whole range at once instead of just characters makes that a lot harder though. - Jonathan M Davis
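
One candidate for "a function in std.algorithm" is levenshteinDistance, which counts the single-character edits needed to turn one string into another. The following is only a rough sketch of how it might be applied to the problem above, not a tested solution: closestMatch is a hypothetical helper, and dividing the distance by the longer length is an assumption made here so that a short name like "food" does not win just because it is short. Lengths are counted in code units, so this assumes ASCII names.

```d
import std.algorithm : levenshteinDistance, max;
import std.stdio : writeln;

/// Hypothetical helper: returns the candidate whose normalized
/// edit distance to `name` is smallest (first one wins on a tie).
string closestMatch(string name, string[] candidates)
{
    string best;
    double bestScore = double.infinity;
    foreach (c; candidates)
    {
        immutable dist = levenshteinDistance(name, c);
        immutable longest = max(name.length, c.length);
        // Normalize to 0..1 so long and short candidates are comparable.
        immutable score = longest == 0 ? 0.0 : cast(double) dist / longest;
        if (score < bestScore)
        {
            bestScore = score;
            best = c;
        }
    }
    return best;
}

void main()
{
    string[] strings = ["food", "lamborghini", "architecture"];
    // File names with the directory and extension already stripped.
    foreach (stem; ["foo", "lmbrghinione", "archtwo"])
        writeln(stem, " -> ", closestMatch(stem, strings));
}
```

With the names from the original question this pairs foo with food, lmbrghinione with lamborghini and archtwo with architecture. Without the normalization, archtwo is equally far from food and architecture, so some threshold or scoring choice of this kind is probably needed in practice.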
Jul 15 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 16 July 2011 05:07:38 dsmith wrote:
 Until recently, you could easily use std.regexp.search(target_string,
 find_string), but regexp is apparently no longer in phobos.  I seek a
 simple substitute.  std.algorithm.canFind might work, as it is bool.
 
 Maybe try something like:
 
 foreach(str; strings)
     foreach(fls; system_files)
         if(std.algorithm.canFind(fls, str))          // usage needs verification
             str ~= ".ext";

std.regex is std.regexp's replacement. - Jonathan M Davis
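
As a rough illustration of what the old std.regexp.search call might look like with std.regex, here is a minimal sketch. The helper name found and the sample pattern are made up for the example, and matchFirst is the call in current Phobos releases (older releases of std.regex offered match instead).

```d
import std.regex : matchFirst, regex;
import std.stdio : writeln;

/// Hypothetical helper: true if `pattern` matches anywhere in `target`.
bool found(string target, string pattern)
{
    // matchFirst returns the captures of the first match;
    // the result is empty when nothing matched.
    return !matchFirst(target, regex(pattern)).empty;
}

void main()
{
    // A loose pattern: the letters l, m, b in order with anything between.
    writeln(found("lmbrghinione.ext", "l.*m.*b"));
    writeln(found("foo.ext", "l.*m.*b"));
}
```

Note that a regex only matches the patterns you spell out, so by itself it is a substring or pattern search and not a fuzzy match.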
Jul 15 2011