digitalmars.D - Something goes wrong with range and sort?

"Andrea Fontana" <nospam example.com> writes:
	import std.algorithm, std.array, std.range, std.stdio;

	string[] test_filter(string[] words)
	{
		static blackList =
		[
			"d", "c", "e", "a", "è", "é", "e"
		].sort();

		return words.filter!((a) => !blackList.assumeSorted.contains(a)).array;
	}

	void main()
	{
		test_filter(["a", "b", "test", "hello"]).writeln;
	}


This code crashes! It's just a useless, trimmed-down version of a
more complex program, just to show you the bug.

If you remove the accented letters from the "blackList" array it works
fine. Why?
Jun 14 2013
"Aleksandar Ruzicic" <aleksandar ruzicic.info> writes:
On Friday, 14 June 2013 at 10:07:13 UTC, Andrea Fontana wrote:
> 	string[] test_filter(string[] words)
> 	{
> 		static blackList =
> 		[
> 			"d", "c", "e", "a", "è", "é", "e"
> 		].sort();
>
> 		return words.filter!((a) => !blackList.assumeSorted.contains(a)).array;
> 	}
>
> 	test_filter(["a", "b", "test", "hello"]).writeln;
>
> This code crashes! It's just a useless, trimmed-down version of a
> more complex program, just to show you the bug.
>
> If you remove the accented letters from the "blackList" array it works
> fine. Why?
sort() requires UTF-32 input (a dstring/dchar[]) or it will fail.

UTF-8 (i.e. the string type in D) is a variable-length encoding, and if such input is given to sort() it will refuse to compile, since it cannot sort it in place (a guarantee sort() makes).

I guess that when UTF-8 input containing only 1-byte characters is given to sort(), it can detect that there are no variable-length characters, and that is probably why your code works when you remove the non-ASCII characters.
Jun 14 2013
"Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 14 June 2013 at 10:07:13 UTC, Andrea Fontana wrote:
> This code crashes! It's just a useless, trimmed-down version of a
> more complex program, just to show you the bug.
>
> If you remove the accented letters from the "blackList" array it works
> fine. Why?
Works for me... http://dpaste.dzfl.pl/79db5a8b Can you produce a failing case on dpaste?
> sort() requires UTF-32 input (a dstring/dchar[]) or it will fail.
>
> UTF-8 (i.e. the string type in D) is a variable-length encoding, and
> if such input is given to sort() it will refuse to compile, since
> it cannot sort it in place (a guarantee sort() makes).
>
> I guess that when UTF-8 input containing only 1-byte characters is
> given to sort(), it can detect that there are no variable-length
> characters, and that is probably why your code works when
> you remove the non-ASCII characters.
sort works just fine with string[]. Note that he's not sorting a string, he's sorting an array of strings.
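
To make the string vs. string[] distinction concrete, here is a minimal sketch that is not from the thread. It assumes a reasonably recent DMD/Phobos; `testFilter` and `sortedChars` are hypothetical names chosen for illustration. The first function sorts an array of strings at run time, which needs no conversion. The second sorts the characters of a single string, which is the case where converting to dchar[] first comes into play. It does not attempt to reproduce or explain the assert reported above.

```d
import std.algorithm : filter, sort;
import std.array : array;
import std.conv : to;
import std.range : assumeSorted;
import std.stdio : writeln;

// Hypothetical run-time variant of test_filter: the blacklist is a
// local string[] rather than a static initializer.
string[] testFilter(string[] words)
{
    auto blackList = ["d", "c", "e", "a", "è", "é", "e"];
    // sort() reorders the string[] in place and returns a SortedRange,
    // so contains() can use binary search.
    auto sorted = blackList.sort();
    return words.filter!(a => !sorted.contains(a)).array;
}

// Hypothetical helper: sorting the characters of ONE string.
// The string is converted to dchar[] first so that each element
// is a whole code point that sort() can swap in place.
dstring sortedChars(string s)
{
    auto chars = s.to!(dchar[]);
    chars.sort();
    return chars.idup;
}

void main()
{
    testFilter(["a", "b", "test", "hello"]).writeln; // ["b", "test", "hello"]
    sortedChars("dcèa").writeln;                      // acdè
}
```

Only "a" is on the blacklist among the input words, so it is the only one filtered out.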
Jun 14 2013
"Aleksandar Ruzicic" <aleksandar ruzicic.info> writes:
On Friday, 14 June 2013 at 10:25:04 UTC, Peter Alexander wrote:
> sort works just fine with string[]. Note that he's not sorting
> a string, he's sorting an array of strings.
Then I must have understood it wrong (or have confused sort() with some other function), gotta go read the docs again :)
Jun 14 2013
"monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 14 June 2013 at 10:30:18 UTC, Aleksandar Ruzicic wrote:
> On Friday, 14 June 2013 at 10:25:04 UTC, Peter Alexander wrote:
>> sort works just fine with string[]. Note that he's not sorting
>> a string, he's sorting an array of strings.
> Then I must have understood it wrong (or have confused sort() with some other function), gotta go read the docs again :)
You are mostly right, but there is a special overload of sort for narrow strings that handles them smartly.
Jun 14 2013
"Andrea Fontana" <nospam example.com> writes:
Just compile your example with -debug and the assert will be hit!

On Friday, 14 June 2013 at 10:25:04 UTC, Peter Alexander wrote:
> On Friday, 14 June 2013 at 10:07:13 UTC, Andrea Fontana wrote:
>> This code crashes! It's just a useless, trimmed-down version of a
>> more complex program, just to show you the bug.
>>
>> If you remove the accented letters from the "blackList" array it works
>> fine. Why?
> Works for me... http://dpaste.dzfl.pl/79db5a8b Can you produce a failing case on dpaste?
Jun 14 2013
"Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 14 June 2013 at 12:40:16 UTC, Andrea Fontana wrote:
> Just compile your example with -debug and the assert will be hit!
Tried that, still cannot repro. As Brad said, what compiler version are you using?
Jun 14 2013
"Peter Alexander" <peter.alexander.au gmail.com> writes:
On Friday, 14 June 2013 at 17:58:36 UTC, Peter Alexander wrote:
> On Friday, 14 June 2013 at 12:40:16 UTC, Andrea Fontana wrote:
>> Just compile your example with -debug and the assert will be hit!
> Tried that, still cannot repro. As Brad said, what compiler version are you using?
I tried with version 2.062 and I get an assert in -debug. If you are on that version or a previous version, you might want to try downloading the latest version.
Jun 14 2013
"Brad Anderson" <eco gnuk.net> writes:
On Friday, 14 June 2013 at 10:07:13 UTC, Andrea Fontana wrote:
> 	string[] test_filter(string[] words)
> 	{
> 		static blackList =
> 		[
> 			"d", "c", "e", "a", "è", "é", "e"
> 		].sort();
>
> 		return words.filter!((a) => !blackList.assumeSorted.contains(a)).array;
> 	}
>
> 	test_filter(["a", "b", "test", "hello"]).writeln;
>
> This code crashes! It's just a useless, trimmed-down version of a
> more complex program, just to show you the bug.
>
> If you remove the accented letters from the "blackList" array it works
> fine. Why?
I'm not seeing a crash here: http://dpaste.dzfl.pl/59d5fb36 What version of the compiler/phobos are you using?
Jun 14 2013