digitalmars.D - The rfind challenge

Andrei Alexandrescu (37/37) Jan 14 2013 (Warning: this assumes a fair bit of background with ranges and the STL....

Andrei Alexandrescu (3/4) Jan 14 2013 s/C=+/C++/

Mehrdad (3/3) Jan 15 2013 Solution: add a "subtraction" operator/method, which, when given

H. S. Teoh (17/32) Jan 14 2013 [...]

Andrei Alexandrescu (4/14) Jan 15 2013 That works, but returns a different type. Ideally rfind would return the...

H. S. Teoh (8/23) Jan 15 2013 [...]
deadalnix (4/21) Jan 15 2013 Retro of retro is supposed to give back the original range, so I

Andrei Alexandrescu (3/23) Jan 15 2013 There's a takeUntil in the middle.

monarch_dodra (54/83) Jan 14 2013 I'd hardly call that an ideal solution >:( The point of rfind is

Andrei Alexandrescu (6/18) Jan 15 2013 I may reply to the rest, but let me destroy this quickly: this is the

monarch_dodra (10/32) Jan 15 2013 I'd argue against providing rfind at all if R is not

Andrei Alexandrescu (3/5) Jan 15 2013 I think cutBefore is sufficient. No? Ideas for better names?

monarch_dodra (21/26) Jan 15 2013 I've been trying to brainstorm a couple of ideas. Here is one:

FG (16/26) Jan 15 2013 That example creates an infinite loop. Needs a popFront:

Andrei Alexandrescu (3/30) Jan 15 2013 Thanks for the fixes!
Timon Gehr (75/102) Jan 15 2013 Yup.

Timon Gehr (13/24) Jan 15 2013 Obviously, this should be
FG (2/3) Jan 15 2013 Thanks for adding the necessary check for element being at position 0.

Phil Lavoie (27/68) Jan 15 2013 Interesting problem. One possibility would be to add a CLASS

Andrei Alexandrescu (6/24) Jan 15 2013 This would be difficult to implement safely, which is an important

Phil Lavoie (14/49) Jan 15 2013 Ok then, imagine we use @reverse, @reversed and @merge primitives:

Phil Lavoie (2/3) Jan 15 2013 And correcting again, invert both to:
Andrei Alexandrescu (3/4) Jan 15 2013 That's too many. Simpler approaches?

FG (9/10) Jan 15 2013 Let me give an outside perspective of someone that doesn't work with D a...

monarch_dodra (30/43) Jan 15 2013 Well, you just stumbled on the entire problem.

FG (5/10) Jan 15 2013 Ah, right. That Take has bitten me before when I was discussing string s...

Phil Lavoie (44/44) Jan 15 2013 Ok, so to sum it up:

Phil Lavoie (2/3) Jan 15 2013 This would have to be corrected.

Phil Lavoie (21/22) Jan 15 2013 Not true. It should default to the genuine bidirectional range's
H. S. Teoh (16/24) Jan 15 2013 [...]

Phil Lavoie (4/34) Jan 15 2013 +1

Phil Lavoie (5/5) Jan 15 2013 And, if ever needed, popBack could become this:

Phil Lavoie (30/30) Jan 15 2013 Continuing with reversible ranges:

Phil Lavoie (8/8) Jan 15 2013 It'd be nicer if we could use an operator, but this is wishful

Phil Lavoie (4/12) Jan 15 2013 What about startOn instead?

Andrei Alexandrescu (31/32) Jan 15 2013 I don't think .reverse will take us far enough. Won't work with arrays

H. S. Teoh (27/32) Jan 15 2013 [...]

Andrei Alexandrescu (4/13) Jan 15 2013 Yah, there's retro already. My point is .reverse is not powerful enough....
monarch_dodra (9/42) Jan 15 2013 I'm having trouble understanding what you guys are going on

monarch_dodra (14/46) Jan 15 2013 But the goal was having a range that start at needle, and ends at

Andrei Alexandrescu (6/10) Jan 15 2013 Sorry, got ahead of myself. r.retro.before would be needed, which

deadalnix (4/18) Jan 15 2013 I think the whole point of reverse is to avoid duplicating

FG (4/18) Jan 15 2013 I'm scared and would feel safer with:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

(Warning: this assumes a fair bit of background with ranges and the STL.)

We've had a long-standing weakness of ranges when compared with STL 
iterators: in a find() call we get to access the balance of the range 
from the found position (if any) through the end of the initial range. 
In contrast, STL's find returns an iterator that can be combined with 
the beginning of the initial range or the end of the initial range, thus 
offering two ranges instead of one.

D offers the findSplit family which partially deals with this, but in an 
arguably imperfect way because the range from the beginning of the 
initial range to the found element has a different type than the initial 
range.

In spite of that, things have worked out well.

Nowadays I've been thinking (inspired by an exchange on a C=+ mailing 
list) that STL's rfind (http://goo.gl/Btg8s) is a real challenge for D's 
ranges.

In D, r.rfind(e) should return a range starting with the last occurrence 
of e in r and spanning through the rest of r.

If r is a forward range but not better, this is rather simple:

R rfind(R, E)(R range, E element)
{
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
}

If the range is random-access, the algorithm is easy to implement 
efficiently by scanning from the end and using indexing.

The tricky case is with bidirectional range. I was unable to define an 
algorithm using the current primitives that: (a) scans the range from 
the end, and (b) returns a range from the found position through the end.

Looks like rfind() would need one extra primitive for bidirectional 
ranges. What would be a good one? I have a few starters, but don't want 
to bias anyone. Please chime in!


Thanks,

Andrei

Jan 14 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 12:51 AM, Andrei Alexandrescu wrote:
 Nowadays I've been thinking (inspired by an exchange on a C=+ mailing

s/C=+/C++/

Andrei

Jan 14 2013

"Mehrdad" <wfunction hotmail.com> writes:

Solution: add a "subtraction" operator/method, which, when given 
a sub-range of a super-range, returns the portion of the 
super-range before and after the sub-range.

Jan 15 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jan 15, 2013 at 12:51:16AM -0500, Andrei Alexandrescu wrote:
[...]
 Nowadays I've been thinking (inspired by an exchange on a C=+
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real
 challenge for D's ranges.
 
 In D, r.rfind(e) should return a range starting with the last
 occurrence of e in r and spanning through the rest of r.
 
 If r is a forward range but not better, this is rather simple:

[...]
 If the range is random-access, the algorithm is easy to implement
 efficiently by scanning from the end and using indexing.
 
 The tricky case is with bidirectional range. I was unable to define
 an algorithm using the current primitives that: (a) scans the range
 from the end, and (b) returns a range from the found position
 through the end.

Hmm. What about introducing a new primitive takeUntil, that returns the
initial segment of the range until a given predicate becomes true? Then
you could implement rfind thus:

	auto rfind(alias pred, R)(R range)
		if (isBidirectionalRange!R)
	{
		return range.retro()
			.takeUntil!pred()
			.retro();
	}


T

-- 
Answer: Because it breaks the logical sequence of discussion. /
Question: Why is top posting bad?

Jan 14 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that returns the
 initial segment of the range until a given predicate becomes true? Then
 you could implement rfind thus:

 	auto rfind(alias pred, R)(R range)
 		if (isBidirectionalRange!R)
 	{
 		return range.retro()
 			.takeUntil!pred()
 			.retro();
 	}

That works, but returns a different type. Ideally rfind would return the 
same type as the initial range, R.

Andrei

Jan 15 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jan 15, 2013 at 07:54:12AM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
Hmm. What about introducing a new primitive takeUntil, that returns the
initial segment of the range until a given predicate becomes true? Then
you could implement rfind thus:

	auto rfind(alias pred, R)(R range)
		if (isBidirectionalRange!R)
	{
		return range.retro()
			.takeUntil!pred()
			.retro();
	}

 
 That works, but returns a different type. Ideally rfind would return
 the same type as the initial range, R.

[...]

Hmm. Then it is indeed impossible with the current primitives, because
to iterate from the back, one would have to pop off the back, so by
definition you can't reach the back anymore.


T

-- 
Государство делает вид, что платит нам
зарплату, а мы делаем вид, что работаем.

Jan 15 2013

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 15 January 2013 at 12:54:12 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that 
 returns the
 initial segment of the range until a given predicate becomes 
 true? Then
 you could implement rfind thus:

 	auto rfind(alias pred, R)(R range)
 		if (isBidirectionalRange!R)
 	{
 		return range.retro()
 			.takeUntil!pred()
 			.retro();
 	}

 That works, but returns a different type. Ideally rfind would 
 return the same type as the initial range, R.

 Andrei

Retro of retro is supposed to give back the original range, so I 
don't think it does.

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 9:16 PM, deadalnix wrote:
 On Tuesday, 15 January 2013 at 12:54:12 UTC, Andrei Alexandrescu wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that returns the
 initial segment of the range until a given predicate becomes true? Then
 you could implement rfind thus:

 auto rfind(alias pred, R)(R range)
 if (isBidirectionalRange!R)
 {
 return range.retro()
 .takeUntil!pred()
 .retro();
 }

 That works, but returns a different type. Ideally rfind would return
 the same type as the initial range, R.

 Andrei

 Retro of retro is supposed to give back the original range, so I don't
 think it does.

There's a takeUntil in the middle.

Andrei

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu 
wrote:
 (Warning: this assumes a fair bit of background with ranges and 
 the STL.)

 We've had a long-standing weakness of ranges when compared with 
 STL iterators: in a find() call we get to access the balance of 
 the range from the found position (if any) through the end of 
 the initial range. In contrast, STL's find returns an iterator 
 that can be combined with the beginning of the initial range or 
 the end of the initial range, thus offering two ranges instead 
 of one.

 D offers the findSplit family which partially deals with this, 
 but in an arguably imperfect way because the range from the 
 beginning of the initial range to the found element has a 
 different type than the initial range.

 In spite of that, things have worked out well.

 Nowadays I've been thinking (inspired by an exchange on a C=+ 
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real 
 challenge for D's ranges.

 In D, r.rfind(e) should return a range starting with the last 
 occurrence of e in r and spanning through the rest of r.

 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
 }

I'd hardly call that an ideal solution >:( The point of rfind is 
to not start the search from the start of the string. What if 
your input was:
//----
string mobyDick = ...:
auto r = rfind(mobyDick, ishmael); This may take a while...
//----

That, and it would fail with rfind("A_bc_bc_bc_A", "bc_bc"): It 
would find "bc_bc_bc_A" instead of "bc_bc_A".

The root cause is that a bidirectional range just can't be 
"cut"/"sliced" at a given point, whereas iterators allow this.

As much as I love ranges, I think this is a fundamental flaw for 
anything less that hasSlicing.

So far, the problem has stayed under the radar, thanks (because?) 
of take, but the problems it brings are clear:
-Bias towards front iteration.
-Type system warping.
-Loss of bidirectionality when taken.

Algorithms suffer because of this. Containers suffer because of 
this.

I don't have any real solution to offer (there'd be a real 
complexity cost to any proposal), but I think there *needs* to be 
a way to provide "inter" slicing of ranges. Something along the 
lines of:
//----
bidir a = [1, 2, 3, 4, 5, 6];
bidir b = a.drop(2).dropBack(2); //[3, 4];
auto l = cutBefore(a, b); //Gets the part of a that is before b
auto r = cutAfter(a, b); //Gets the part of a that is after b
static assert(typeof(a) == typeof(l));
static assert(typeof(a) == typeof(r));
assert(equal(l), [1, 2]);
assert(equal(r), [5, 6]);
//----

The advantage here is that:
1) consistent typing.
2) no direction bias.

If we had this, then find could just return the slice containing 
the found element, and user could work his way from there.

Problem is bloating the interface ... That, and we'd also need 
the versions that keep the cut element...
//----
bidir a = [1, 2, 3, 4, 5, 6];
bidir b = a.drop(2).dropBack(2); //[3, 4];
auto l = mergeBefore(a, b); //Gets the part of a that is before 
b, with b
auto r = mergeAfter(a, b); //Gets the part of a that is after b, 
with b
assert(equal(l), [1, 2, 3, 4]);
assert(equal(r), [3, 4, 5, 6]);
//----

It is a complicated problem, but one that is worth solving, IMO.

Jan 14 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 2:20 AM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu wrote:
 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }

 I'd hardly call that an ideal solution >:( The point of rfind is to not
 start the search from the start of the string.

I may reply to the rest, but let me destroy this quickly: this is the 
implementation for a forward range that's not bidirectional. The 
challenge is indeed implementing rfind for bidirectional ranges that are 
not random (e.g. strings).


Andrei

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 12:56:43 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:20 AM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei 
 Alexandrescu wrote:
 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }

 I'd hardly call that an ideal solution >:( The point of rfind 
 is to not
 start the search from the start of the string.

 I may reply to the rest, but let me destroy this quickly: this 
 is the implementation for a forward range that's not 
 bidirectional.

I'd argue against providing rfind at all if R is not 
bidirectional. But that's not the thread's main issue, so let's 
not focus on this.

 The challenge is indeed implementing rfind for bidirectional 
 ranges that are not random (e.g. strings).

 Andrei

Agreed.

On a side note, it would be very easy if we agreed to return the 
subrange starting at the beginning of range, and ending after the 
last element.

But that's really just avoiding the problem.

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 2:20 AM, monarch_dodra wrote:
 auto l = cutBefore(a, b); //Gets the part of a that is before b
 auto r = cutAfter(a, b); //Gets the part of a that is after b

I think cutBefore is sufficient. No? Ideas for better names?

Andrei

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 17:57:07 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:20 AM, monarch_dodra wrote:
 auto l = cutBefore(a, b); //Gets the part of a that is before b
 auto r = cutAfter(a, b); //Gets the part of a that is after b

 I think cutBefore is sufficient. No? Ideas for better names?

 Andrei

I've been trying to brainstorm a couple of ideas. Here is one:

//----
Tuple(R, R) cut(R other); //Returns both parts of the
                           //current range that are
                           //Before and After other

usage:
//----
Tuple(R, R) slices = a.cut(b); //Cuts a in two pieces, relative 
to b.
assert(equal(a, chain(slices[0], b, slices[1])));

 From there, we adding just the "merge" primitive would give us:

//----
R result = merge(b, slices[1]);

The neat thing here is that we have consistent typing the entire 
algorithm through. Yay!

That's two extra functions, which can provide slicing and merging 
for bidirectional ranges. And they can be provided without 
breaking anything existing. Not too shabby (IMO).

This is still very sketchy of course, so don't destroy too hard ;)

Jan 15 2013

FG <home fgda.pl> writes:

On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
      for (;;)
      {
          auto ahead = range.save.find(element);
          if (ahead.empty) return range;
          range = ahead;
      }
 }

That example creates an infinite loop. Needs a popFront:

     R rfind(R, E)(R range, E element)
     {
         if (range.empty) return range;
         for (;;)
         {
             auto ahead = range.save;
             ahead.popFront();
             ahead = ahead.find(element);
             if (ahead.empty) return range;
             range = ahead;
         }
     }

Another thing: if the element is not found, I think an empty range should be 
returned and not the whole range.

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 6:53 AM, FG wrote:
 On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }

 That example creates an infinite loop. Needs a popFront:

 R rfind(R, E)(R range, E element)
 {
 if (range.empty) return range;
 for (;;)
 {
 auto ahead = range.save;
 ahead.popFront();
 ahead = ahead.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }

 Another thing: if the element is not found, I think an empty range
 should be returned and not the whole range.

Thanks for the fixes!

Andrei

Jan 15 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/15/2013 12:53 PM, FG wrote:
 On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
      for (;;)
      {
          auto ahead = range.save.find(element);
          if (ahead.empty) return range;
          range = ahead;
      }
 }

 That example creates an infinite loop. Needs a popFront:

      R rfind(R, E)(R range, E element)
      {
          if (range.empty) return range;
          for (;;)
          {
              auto ahead = range.save;
              ahead.popFront();
              ahead = ahead.find(element);
              if (ahead.empty) return range;
              range = ahead;
          }
      }

That is buggy too.

 Another thing: if the element is not found, I think an empty range
 should be returned and not the whole range.

Yup.

R rfind(R, E)(R range,E element){
     auto r=range.find(element);
     if(r.empty) return r;
     for(;;){
         auto ahead=r.save;
         ahead.popFront();
         ahead=ahead.find(element);
         if(ahead.empty) return r;
         r=ahead;
     }
}


Full proof (assuming purity of the range primitives):

/+pure+/ R tail(R)(R r)out(result){
     assert(result=={
         auto a=r;
         a.popFront();
         return a;
     }());
}body{
     auto a=r;
     a.popFront();
     return a;
}

/+pure+/ R rfind(R, E)(R range,E element)out(result){
	assert(range.find(element).empty&&result.empty||!range.find(element).empty&&!result.empty&&result.front==element&&result.tail.find(element).empty);
}body{
     auto r=range.find(element);
     assert(r==range.find(element)&&(r.empty||r.front==element));
     if(r.empty){
         assert(r==range.find(element)&&r.empty);
         return r;
         /+assert(result==range.find(element)&&result.empty);
         assert(range.find(element).empty&&result.empty);+/
     }
     assert(r==range.find(element)&&!r.empty&&r.front==element);
     assert(!range.find(element).empty&&!r.empty&&r.front==element);
     for(;;){
 
assert(true&&!range.find(element).empty&&!r.empty&&r.front==element);
 
assert(!range.find(element).empty&&!r.empty&&r.front==element&&!r.empty);
         auto ahead=r.save;
 
assert(!range.find(element).empty&&!r.empty&&r.front==element&&!ahead.empty&&ahead==r.save);
         ahead.popFront();
 
assert(!range.find(element).empty&&!r.empty&&r.front==element&&ahead=={
             auto ahead=r.save;
             ahead.popFront();
             return ahead;
         }());
 
assert(!range.find(element).empty&&!r.empty&&r.front==element&&ahead==r.tail);
         ahead=ahead.find(element);
 
assert(!range.find(element).empty&&!r.empty&&r.front==element&&(ahead.empty||ahead.front==element)&&ahead==r.tail.find(element));
         if(ahead.empty){
 
assert(!range.find(element).empty&&ahead.empty&&!r.empty&&r.front==element&&ahead==r.tail.find(element));
             return r;
 
/+assert(!range.find(element).empty&&ahead.empty&&!result.empty&&result.front==element&&ahead==result.tail.find(element));
 
assert(!range.find(element).empty&&!result.empty&&result.front==element&&result.tail.find(element).empty);+/
         }
 
assert(!range.find(element).empty&&!ahead.empty&&ahead.front==element);
         r=ahead;
         assert(!range.find(element).empty&&!r.empty&&r.front==element);
     }
     assert(false);
}

Jan 15 2013

Timon Gehr <timon.gehr gmx.ch> writes:

On 01/15/2013 02:30 PM, Timon Gehr wrote:
 /+pure+/ R tail(R)(R r)out(result){
      assert(result=={
          auto a=r;
          a.popFront();
          return a;
      }());
 }body{
      auto a=r;
      a.popFront();
      return a;
 }

Obviously, this should be

/+pure+/ R tail(R)(R r)out(result){
     assert(result=={
         auto a=r.save;
         a.popFront();
         return a;
     }());
}body{
     auto a=r.save;
     a.popFront();
     return a;
}

Jan 15 2013

FG <home fgda.pl> writes:

On 2013-01-15 14:30, Timon Gehr wrote:
 That is buggy too.

Thanks for adding the necessary check for element being at position 0.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu 
wrote:
 (Warning: this assumes a fair bit of background with ranges and 
 the STL.)

 We've had a long-standing weakness of ranges when compared with 
 STL iterators: in a find() call we get to access the balance of 
 the range from the found position (if any) through the end of 
 the initial range. In contrast, STL's find returns an iterator 
 that can be combined with the beginning of the initial range or 
 the end of the initial range, thus offering two ranges instead 
 of one.

 D offers the findSplit family which partially deals with this, 
 but in an arguably imperfect way because the range from the 
 beginning of the initial range to the found element has a 
 different type than the initial range.

 In spite of that, things have worked out well.

 Nowadays I've been thinking (inspired by an exchange on a C=+ 
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real 
 challenge for D's ranges.

 In D, r.rfind(e) should return a range starting with the last 
 occurrence of e in r and spanning through the rest of r.

 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
 }

 If the range is random-access, the algorithm is easy to 
 implement efficiently by scanning from the end and using 
 indexing.

 The tricky case is with bidirectional range. I was unable to 
 define an algorithm using the current primitives that: (a) 
 scans the range from the end, and (b) returns a range from the 
 found position through the end.

 Looks like rfind() would need one extra primitive for 
 bidirectional ranges. What would be a good one? I have a few 
 starters, but don't want to bias anyone. Please chime in!


 Thanks,

 Andrei

Interesting problem. One possibility would be to add a CLASS 
primitive like this: Range.merge( Range start, Range end ) 
//Takes the start of the start and the end of end.

Pseudo:
   original = range.save
   resPos = range.retro.find
   return Range.merge( resPos, original ) //DOES NOT WORK

The problem with this solution is that:
Retro returns a different range type and it would be more 
complicated to
have the merge function work with this type.

An addition to this solution could be another new primitive: 
reverse:

Pseudo:
   original = range.save
   reversed = range.reverse //Permanently invert start an end. I 
think this is feasible for all bidirectional ranges. Still a 
bidirectional range.
   reversed.find( needle ) //In haystack :)
   return Range.merge( reversed, original ) //Ok, start of 
reversed is on needle and end of original hasn't moved. The order 
of the range returned is the same as the one received.

This is just a suggestion and any primitive could be renamed. 
However, I think reverse is a good place to start.

Phil

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 12:03 PM, Phil Lavoie wrote:
 Pseudo:
 original = range.save
 resPos = range.retro.find
 return Range.merge( resPos, original ) //DOES NOT WORK

 The problem with this solution is that:
 Retro returns a different range type and it would be more complicated to
 have the merge function work with this type.

This would be difficult to implement safely, which is an important 
disadvantage.

 An addition to this solution could be another new primitive: reverse:

 Pseudo:
 original = range.save
 reversed = range.reverse //Permanently invert start an end. I think this
 is feasible for all bidirectional ranges. Still a bidirectional range.
 reversed.find( needle ) //In haystack :)
 return Range.merge( reversed, original ) //Ok, start of reversed is on
 needle and end of original hasn't moved. The order of the range returned
 is the same as the one received.

 This is just a suggestion and any primitive could be renamed. However, I
 think reverse is a good place to start.

Reverse is really cool and immediate to implement safely for a lot of 
ranges I can think of. How would you define rfind assuming reverse?


Andrei

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

On Tuesday, 15 January 2013 at 17:14:32 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 12:03 PM, Phil Lavoie wrote:
 Pseudo:
 original = range.save
 resPos = range.retro.find
 return Range.merge( resPos, original ) //DOES NOT WORK

 The problem with this solution is that:
 Retro returns a different range type and it would be more 
 complicated to
 have the merge function work with this type.

 This would be difficult to implement safely, which is an 
 important disadvantage.

 An addition to this solution could be another new primitive: 
 reverse:

 Pseudo:
 original = range.save
 reversed = range.reverse //Permanently invert start an end. I 
 think this
 is feasible for all bidirectional ranges. Still a 
 bidirectional range.
 reversed.find( needle ) //In haystack :)
 return Range.merge( reversed, original ) //Ok, start of 
 reversed is on
 needle and end of original hasn't moved. The order of the 
 range returned
 is the same as the one received.

 This is just a suggestion and any primitive could be renamed. 
 However, I
 think reverse is a good place to start.

 Reverse is really cool and immediate to implement safely for a 
 lot of ranges I can think of. How would you define rfind 
 assuming reverse?


 Andrei

Ok then, imagine we use  reverse,  reversed and  merge primitives:

auto rfind( Range, E )( Range r , E e ) if( blablabla... ) {
   auto original = r.save;
   auto reversed = r.reverse;
   auto found = find( reversed, e );
   auto res = Range.merge( found, original );
   return original.reversed ? res : res.reverse;
}

And correcting what I said previously:

void reverse() {
  _reversed = !reversed;
}

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

   return original.reversed ? res : res.reverse;

And correcting again, invert both to:
  return original.reversed? res.reverse: res;

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 12:24 PM, Phil Lavoie wrote:
 Ok then, imagine we use  reverse,  reversed and  merge primitives:

That's too many. Simpler approaches?

Andrei

Jan 15 2013

FG <home fgda.pl> writes:

On 2013-01-15 19:00, Andrei Alexandrescu wrote:
 That's too many. Simpler approaches?


Let me give an outside perspective of someone that doesn't work with D all day.
For forward ranges the original method is fine, but for bidirectional r and e I 
would expect the following code to somehow just work, but it doesn't:

     auto rfind(R1 r, R2 e) {
         return retro(findSplitAfter(retro(r), retro(e))[0]);
     }

Other than the tiny detail of this not working, existing stuff like findSplit, 
findSplitBefore and findSplitAfter would be enough to define their r* versions.

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 19:44:39 UTC, FG wrote:
 On 2013-01-15 19:00, Andrei Alexandrescu wrote:
 That's too many. Simpler approaches?


 Let me give an outside perspective of someone that doesn't work 
 with D all day.
 For forward ranges the original method is fine, but for 
 bidirectional r and e I would expect the following code to 
 somehow just work, but it doesn't:

     auto rfind(R1 r, R2 e) {
         return retro(findSplitAfter(retro(r), retro(e))[0]);
     }

 Other than the tiny detail of this not working, existing stuff 
 like findSplit, findSplitBefore and findSplitAfter would be 
 enough to define their r* versions.

Well, you just stumbled on the entire problem.

Given a bidirectional range BR, findSplitBefore actually cheats. 
This is the basic algorithm: popFront until you find the what you 
are looking for. Return the found range, as well as "what was 
before"

The problem is that there is no real way to return "what was 
before", so we use a tool called "Take", that adapts a range to 
only hold the first N elements. Basically, it returns "take(br, 
n)". The problem though is that:
a) The type is not BR anymore, but Take!BR
b) Take!BR is not bidirectional.

So basically:
//--------
BR br = BR(1, 2, 3, 4);
result = br.findSplitAfter([3]);
auto lhs = result[0]; //[1, 2]
auto rhs = result[1]; //[4]
//typeof(lhs): Take!BR
//typeof(rhs): BR
//--------

And this is the root problem. As you can see, element 0 is not 
actually a bidirectional range anymore, nor is it of type BR 
either. This "tiny detail" as you say, is actually fundamental ;)

This is why I'd like to find a way to cut bidirectional ranges: 
Not just for r* functions, but also for our everyday functions: 
As you can see, we don't have any way to "findBefore" and store 
the result in a BR. Ouch.

As a user of D, I agree with you that it should "just work". Real 
life is different though :/

Jan 15 2013

FG <home fgda.pl> writes:

On 2013-01-15 21:14, monarch_dodra wrote:
 This is why I'd like to find a way to cut bidirectional ranges: Not just for r*
 functions, but also for our everyday functions: As you can see, we don't have
 any way to "findBefore" and store the result in a BR. Ouch.

 As a user of D, I agree with you that it should "just work". Real life is
 different though :/


Ah, right. That Take has bitten me before when I was discussing string slicing 
using negative indexes. :)

In that case I wish this cutting that preserves the type of range will get 
implemented and the cheats in findSplit removed.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

Ok, so to sum it up:
rfind should return the same type as provided.
Its direction (popFront()) should be the same.

auto rfing( Range, E )( Range r, E e ) if (blablabla) {
   auto original = r.save; //This is to remember the end of the 
orginal range.
   r.reverse; //Reverse it. The range keeps track of its direction.
   auto found = find( r, e ); //Now front() shall give us e, and 
popFront() moves toward original's beginning.
   auto res = Range.merge( found, original ); //Take the start of 
found, which is on e and take the end of original, which has not 
moved.
   //Where does popFront() goes now? Depends: either expect merge 
to decide for us or make sure its just dumb.
   //If it is dumb, then the assumption we make about the 
direction of the returned range is that popFront() goes in the 
not reversed direction (reversed returns false). Therefore, we 
have start on e and end where we want it, we only need to move in 
the proper direction, which is the one of the original.
   if( original.reversed ) { res.reverse; }
   return res;
}

struct DListRange {
   Node * _start;
   Node * _end; //Initialized to the lists'end.
   bool _reversed = false;

   bool empty() { return _start is null || _end is null; } //The 
user has seen it all.

   void popBack() {
     if( _reversed ) {
      _start = _start.next;
     } else {
      _end = _end.previous;
     }
   }

   ...

    property bool reversed() { return _reversed; }
   void reverse() { _reversed = !_reversed; }

   //Dumb version.
   static typeof( this ) merge( typeof( this ) sr, typeof( this ) 
er ) {
    return typeof( this )( sr._start, er._end );
   }
}

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

   bool empty() { return _start is null || _end is null; } //The

This would have to be corrected.

I agree that there are too many primitives.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

 The order of the range returned is the same as the one received.

Not true. It should default to the genuine bidirectional range's 
direction.
If it was reversed before calling rfind we could: let the user 
reverse the result again or add the possibility to query the 
range to know it it's already been reversed (compared to original 
ordering):

 property bool reversed() {}

In fact, this is to be known anyways if the range has to keep 
internal data to know in which direction it has to move. Meaning, 
reverse should be implemented like that:

VOID reverse() {
  _reversed = true;
}

instead of

ReversedBidirectionalRange reverse() {
   return blablabla...
}

There truly are multiple ways to implement a solution to this 
problem. We could also have merge decide the direction of the 
range based on the "end" range (since we are moving towards this 
one).

Jan 15 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jan 15, 2013 at 06:03:06PM +0100, Phil Lavoie wrote:
[...]
 An addition to this solution could be another new primitive:
 reverse:
 
 Pseudo:
   original = range.save
   reversed = range.reverse //Permanently invert start an end. I
 think this is feasible for all bidirectional ranges. Still a
 bidirectional range.

[...]

I like this very much, actually. I think .reverse is much more useful
than .back and .popBack. I mean, how many algorithms actually use .back
and .popBack? Probably less than a handful, if any. Most algorithms use
.front and popFront().

Viewed in that light, what then is a bidirectional range, if not a range
where you can swap the direction of iteration? That is, we can redefine
a bidirectional range to be one that implements .reverse, with the
property that R.reverse.reverse == R. Get rid of .back and .popBack,
which hardly anybody uses anyway. It simplifies the range API
significantly, and makes rfind implementable in terms of primitives.


T

-- 
Ruby is essentially Perl minus Wall.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

On Tuesday, 15 January 2013 at 18:19:48 UTC, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 06:03:06PM +0100, Phil Lavoie wrote:
 [...]
 An addition to this solution could be another new primitive:
 reverse:
 
 Pseudo:
   original = range.save
   reversed = range.reverse //Permanently invert start an end. I
 think this is feasible for all bidirectional ranges. Still a
 bidirectional range.

 [...]

 I like this very much, actually. I think .reverse is much more 
 useful
 than .back and .popBack. I mean, how many algorithms actually 
 use .back
 and .popBack? Probably less than a handful, if any. Most 
 algorithms use
 .front and popFront().

 Viewed in that light, what then is a bidirectional range, if 
 not a range
 where you can swap the direction of iteration? That is, we can 
 redefine
 a bidirectional range to be one that implements .reverse, with 
 the
 property that R.reverse.reverse == R. Get rid of .back and 
 .popBack,
 which hardly anybody uses anyway. It simplifies the range API
 significantly, and makes rfind implementable in terms of 
 primitives.


 T

+1

I was about to suggest a solution based on Reversible ranges 
instead of bidirectional ranges, but I have yet to polish it.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

And, if ever needed, popBack could become this:

void popBack( R )( ref R r ) if( isReversibleRange!R ) {
   auto reversed = r.reverse.popFront();
   r = reversed.reverse;
}

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

Continuing with reversible ranges:

struct ForwardRange {
  _start;
  _end;

  BackwardRange reverse() { return BackwardRange( _end, _start ); }
}

struct BackwardRange {
  _start;
  _end;

  //guess what reverse is.
}

That would give the power to mimick bidirectional ranges (see 
post on popBack before)

Now, we still need a way to move one of those pointers to a given 
place to fulfill rfind:

auto rfind( R, E )( R r, E e ) if( isReversibleRange!R ) {
  auto original = r.save; //For the end purpose;
  auto found = find( r.reverse, e ); //found is reversed and 
starts on e.
  //What we want now is a range of type typeof( original ) 
starting on found but ending on original. Therefore, we need an 
additional primitive. Suggestion: jumpTo.
  return original.jumpTo( found ); //Keeps the ordering, only move 
the start.
}

struct ForwardRange {
...
void jumpTo( ForwardRange r )( _start = r._start; }
void jumpTo( BackwardRange r )( _start = r._start; }

}

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

It'd be nicer if we could use an operator, but this is wishful 
thinking:

auto r = r2.jumpTo( d2 ); //r2d2, yes.
Equivalent to:
auto r = r2[ d2 .. $ ];
Or:
auto r = r2[ d2 ... ];

Again, wishful thinking.

Jan 15 2013

"Phil Lavoie" <maidenphil hotmail.com> writes:

On Tuesday, 15 January 2013 at 19:11:43 UTC, Phil Lavoie wrote:
 It'd be nicer if we could use an operator, but this is wishful 
 thinking:

 auto r = r2.jumpTo( d2 ); //r2d2, yes.
 Equivalent to:
 auto r = r2[ d2 .. $ ];
 Or:
 auto r = r2[ d2 ... ];

 Again, wishful thinking.

What about startOn instead?

That makes two additional primitives and one additional range 
type( -Bidirectional + Reversible + StartableOn??!?!??!?!?!)

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:

I don't think .reverse will take us far enough. Won't work with arrays 
which kinda puts a monkey wrench into everything. r1.before(r2) works 
much better:

R rfind(R, E)(R r, E e) {
   auto original = r.save;
   for (; !r.empty; r.popBack) {
     if (r.endsWith(e)) break;
   }
   return original.before(r);
}

The generic implementation of before is trivial:

auto before(R)(R theBuck, R stopsHere) {
   static struct Result {
     bool empty() { return r1 is r2; }
     auto ref front() { return r1.front; }
     void popFront() { r1.popFront; }
     private R r1, r2;
   }
   return Result(theBuck, stopsHere);
}

For random-access ranges:

auto before(R)(R theBuck, R stopsHere) {
   return theBuck[0 .. theBuck.length - stopsHere.length];
}

(Glossing over checks and balances etc.)

Bidirectional ranges would need to implement before natively to return 
the same type as the original range.

The question is whether before() is enough for many other algorithms 
we'd want to define.


Andrei

Jan 15 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
Continuing with reversible ranges:

 
 I don't think .reverse will take us far enough. Won't work with
 arrays which kinda puts a monkey wrench into everything.

[...]

Not really. We can define an array wrapper with a .reverse that returns
the original array, something like:

	// In std.array
	auto reverse(R)(R array)
		if (is(R _ : E[], E)
	{
		static struct ReverseArray {
			R[] src;

			 property auto front() { return src[$-1]; }
			 property bool empty() { return src.empty; }
			 property void popFront() {
				src = src[0..$-1];
			}
			 property auto reverse() {
				return src;
			}
			... // other wrapper methods
		}
		return ReverseArray(array);
	}

This will let you implement rfind in a way that returns the original
array.


T

-- 
Not all rumours are as misleading as this one.

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 4:22 PM, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:

 I don't think .reverse will take us far enough. Won't work with
 arrays which kinda puts a monkey wrench into everything.

 [...]

 Not really. We can define an array wrapper with a .reverse that returns
 the original array

Yah, there's retro already. My point is .reverse is not powerful enough. 
 From what I can tell .before is.

Andrei

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 21:24:19 UTC, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu 
 wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
Continuing with reversible ranges:

 
 I don't think .reverse will take us far enough. Won't work with
 arrays which kinda puts a monkey wrench into everything.

 [...]

 Not really. We can define an array wrapper with a .reverse that 
 returns
 the original array, something like:

 	// In std.array
 	auto reverse(R)(R array)
 		if (is(R _ : E[], E)
 	{
 		static struct ReverseArray {
 			R[] src;

 			 property auto front() { return src[$-1]; }
 			 property bool empty() { return src.empty; }
 			 property void popFront() {
 				src = src[0..$-1];
 			}
 			 property auto reverse() {
 				return src;
 			}
 			... // other wrapper methods
 		}
 		return ReverseArray(array);
 	}

 This will let you implement rfind in a way that returns the 
 original
 array.


 T

I'm having trouble understanding what you guys are going on 
about. Isn't this "reverse" just retro? And how would it solve 
anything, when retro doesn't?

I thought the point of "reverse" was that it preserved type, but 
this adapter doesn't do that...

Sorry for being dense, but I don't get it :/ I may have missed 
something earlier in the thread. Could you elaborate on the 
solution...?

Jan 15 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:

 I don't think .reverse will take us far enough. Won't work with 
 arrays which kinda puts a monkey wrench into everything. 
 r1.before(r2) works much better:

 R rfind(R, E)(R r, E e) {
   auto original = r.save;
   for (; !r.empty; r.popBack) {
     if (r.endsWith(e)) break;
   }
   return original.before(r);
 }

 The generic implementation of before is trivial:

 auto before(R)(R theBuck, R stopsHere) {
   static struct Result {
     bool empty() { return r1 is r2; }
     auto ref front() { return r1.front; }
     void popFront() { r1.popFront; }
     private R r1, r2;
   }
   return Result(theBuck, stopsHere);
 }

 For random-access ranges:

 auto before(R)(R theBuck, R stopsHere) {
   return theBuck[0 .. theBuck.length - stopsHere.length];
 }

 (Glossing over checks and balances etc.)

 Bidirectional ranges would need to implement before natively to 
 return the same type as the original range.

 The question is whether before() is enough for many other 
 algorithms we'd want to define.


 Andrei

But the goal was having a range that start at needle, and ends at 
haystackEnd. Your implementation doesn't do that. It would have 
to use a "after" instead of a "before".

Your "before" proposal is similar to a take, but instead of being 
"index" based, it is "sentinel" based. This does have its 
strengths too, but doesn't solve the problem.

The problem is that "after", just like the would be "takeBack", 
can't offer the front primitive for bidirectional ranges.

I think that long story short, and regardless of return types:
Unless the range has some sort of built-in primitive that allows 
some form of extracting a subrange from it, there is no way to 
obtain a range from the back of a bidirectional range.

Jan 15 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 1/15/13 4:57 PM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei Alexandrescu wrote:

[snip]
 But the goal was having a range that start at needle, and ends at
 haystackEnd. Your implementation doesn't do that. It would have to use a
 "after" instead of a "before".

Sorry, got ahead of myself. r.retro.before would be needed, which 
requires r.after. I'd like to only add one primitive if at all possible, 
or have a solid proof that two are needed. Back to the drawing board...

Andrei

Jan 15 2013

"deadalnix" <deadalnix gmail.com> writes:

On Tuesday, 15 January 2013 at 22:49:45 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 4:57 PM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei 
 Alexandrescu wrote:

 [snip]
 But the goal was having a range that start at needle, and ends 
 at
 haystackEnd. Your implementation doesn't do that. It would 
 have to use a
 "after" instead of a "before".

 Sorry, got ahead of myself. r.retro.before would be needed, 
 which requires r.after. I'd like to only add one primitive if 
 at all possible, or have a solid proof that two are needed. 
 Back to the drawing board...

 Andrei

I think the whole point of reverse is to avoid duplicating 
before/after front/back popFront/popBack etc . . .

Jan 15 2013

FG <home fgda.pl> writes:

On 2013-01-15 21:59, Andrei Alexandrescu wrote:
 The generic implementation of before is trivial:

 auto before(R)(R theBuck, R stopsHere) {
    static struct Result {
      bool empty() { return r1 is r2; }
      auto ref front() { return r1.front; }
      void popFront() { r1.popFront; }
      private R r1, r2;
    }
    return Result(theBuck, stopsHere);
 }

I'm scared and would feel safer with:
bool empty() { return r1 is r2 || r1.empty(); }


 For random-access ranges:

 auto before(R)(R theBuck, R stopsHere) {
    return theBuck[0 .. theBuck.length - stopsHere.length];
 }

This implies that theBuck._end == stopsHere._end, but why?

Jan 15 2013

D Programming

C/C++ Programming

Other

digitalmars.D - The rfind challenge