digitalmars.D.learn - Case-insensitive BoyerMooreFinder

Ralf <ralf nowhere.com> writes:
Hi,

I just spent a day with the D programming language and I am very 
excited about being able to write such performant programs in 
such a clear and concise way!

I tried to write a little search tool.
The BoyerMooreFinder docs say that the comparison operator can be 
specified. Unfortunately, this doesn't work in practice:

import std.algorithm;
import std.stdio;

void main(string[] args) {
	writeln("> ", find("Hello World", boyerMooreFinder("World")));
	writeln("> ", find("Hello World",
		boyerMooreFinder!("toLower(a) == toLower(b)")("world")));
	writeln("> ", find("EE Hello World",
		boyerMooreFinder!("toLower(a) == toLower(b)")("worl")));
}

Result:

> World
> World
> 
The problem is here:
https://github.com/D-Programming-Language/phobos/blob/v2.069.1/std/algorithm/searching.d#L280

When it looks letters up in its private lookup tables, it uses the 
original letters, not the lower-case ones. To work around this, I copied 
the algorithm class and customized it to use the .toLower letter there as 
well, and it works like a charm.

I wondered: Am I missing something here? Calling .toLower() on the whole 
string makes this a lot slower; is there a way to customize the algorithm 
with some mapping for the characters? Is there a use case for customizing 
the predicate only?

Greetings,
Ralf
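
For readers following along, the workaround described above might look roughly like the sketch below. It is not the Phobos BoyerMooreFinder code: it assumes a simplified Boyer-Moore-Horspool variant (bad-character table only), folds case for ASCII letters only via std.ascii.toLower, and works on UTF-8 code units. The name findIgnoreCaseHorspool is made up for illustration. The point is that the same mapping is applied both when comparing characters and when indexing the skip table.

```d
import std.ascii : toLower;

/// Hypothetical helper: returns the part of `haystack` starting at the first
/// ASCII-case-insensitive occurrence of `needle`, or an empty slice if none.
string findIgnoreCaseHorspool(string haystack, string needle)
{
    immutable n = needle.length;
    if (n == 0)
        return haystack;
    if (haystack.length < n)
        return haystack[$ .. $];

    // Skip table keyed by the *lower-cased* code unit, so lookups and
    // comparisons agree on the same mapping.
    size_t[256] skip = n;
    foreach (i; 0 .. n - 1)
        skip[cast(ubyte) toLower(needle[i])] = n - 1 - i;

    size_t pos = 0;
    while (pos + n <= haystack.length)
    {
        // Compare the current window right to left, folding case on both sides.
        size_t j = n;
        while (j > 0 && toLower(haystack[pos + j - 1]) == toLower(needle[j - 1]))
            --j;
        if (j == 0)
            return haystack[pos .. $];

        // Shift by the table entry for the lower-cased last code unit
        // of the window (the lookup that used the original letter in the report).
        pos += skip[cast(ubyte) toLower(haystack[pos + n - 1])];
    }
    return haystack[$ .. $];
}
```

With this sketch, findIgnoreCaseHorspool("EE Hello World", "worl") yields "World" without lower-casing the whole haystack first. Non-ASCII letters are compared exactly, so full Unicode case folding would need a different mapping and table.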
Nov 21 2015
Ali Çehreli <acehreli yahoo.com> writes:
On 11/21/2015 08:50 AM, Ralf wrote:

> The problem is here:
> https://github.com/D-Programming-Language/phobos/blob/v2.069.1/std/algorithm/searching.d#L280
>
> When it looks letters up in its private lookup tables, it uses the
> original letters, not the lower case ones. To work around this, I copied
> the algorithm class, customized this to use the .toLower letter here as
> well and it works like a charm.

Please bring this up on the main newsgroup. It sounds like an oversight 
to me, which may require an API change.

Ali
Nov 23 2015