www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - The rfind challenge

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
(Warning: this assumes a fair bit of background with ranges and the STL.)

We've had a long-standing weakness of ranges when compared with STL 
iterators: in a find() call we get to access the balance of the range 
from the found position (if any) through the end of the initial range. 
In contrast, STL's find returns an iterator that can be combined with 
the beginning of the initial range or the end of the initial range, thus 
offering two ranges instead of one.

D offers the findSplit family which partially deals with this, but in an 
arguably imperfect way because the range from the beginning of the 
initial range to the found element has a different type than the initial 
range.

In spite of that, things have worked out well.

Nowadays I've been thinking (inspired by an exchange on a C=+ mailing 
list) that STL's rfind (http://goo.gl/Btg8s) is a real challenge for D's 
ranges.

In D, r.rfind(e) should return a range starting with the last occurrence 
of e in r and spanning through the rest of r.

If r is a forward range but not better, this is rather simple:

R rfind(R, E)(R range, E element)
{
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
}

If the range is random-access, the algorithm is easy to implement 
efficiently by scanning from the end and using indexing.

The tricky case is with bidirectional range. I was unable to define an 
algorithm using the current primitives that: (a) scans the range from 
the end, and (b) returns a range from the found position through the end.

Looks like rfind() would need one extra primitive for bidirectional 
ranges. What would be a good one? I have a few starters, but don't want 
to bias anyone. Please chime in!


Thanks,

Andrei
Jan 14 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 12:51 AM, Andrei Alexandrescu wrote:
 Nowadays I've been thinking (inspired by an exchange on a C=+ mailing
s/C=+/C++/ Andrei
Jan 14 2013
parent "Mehrdad" <wfunction hotmail.com> writes:
Solution: add a "subtraction" operator/method, which, when given 
a sub-range of a super-range, returns the portion of the 
super-range before and after the sub-range.
Jan 15 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 15, 2013 at 12:51:16AM -0500, Andrei Alexandrescu wrote:
[...]
 Nowadays I've been thinking (inspired by an exchange on a C=+
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real
 challenge for D's ranges.
 
 In D, r.rfind(e) should return a range starting with the last
 occurrence of e in r and spanning through the rest of r.
 
 If r is a forward range but not better, this is rather simple:
[...]
 If the range is random-access, the algorithm is easy to implement
 efficiently by scanning from the end and using indexing.
 
 The tricky case is with bidirectional range. I was unable to define
 an algorithm using the current primitives that: (a) scans the range
 from the end, and (b) returns a range from the found position
 through the end.
Hmm. What about introducing a new primitive takeUntil, that returns the initial segment of the range until a given predicate becomes true? Then you could implement rfind thus: auto rfind(alias pred, R)(R range) if (isBidirectionalRange!R) { return range.retro() .takeUntil!pred() .retro(); } T -- Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
Jan 14 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that returns the
 initial segment of the range until a given predicate becomes true? Then
 you could implement rfind thus:

 	auto rfind(alias pred, R)(R range)
 		if (isBidirectionalRange!R)
 	{
 		return range.retro()
 			.takeUntil!pred()
 			.retro();
 	}
That works, but returns a different type. Ideally rfind would return the same type as the initial range, R. Andrei
Jan 15 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 15, 2013 at 07:54:12AM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
Hmm. What about introducing a new primitive takeUntil, that returns the
initial segment of the range until a given predicate becomes true? Then
you could implement rfind thus:

	auto rfind(alias pred, R)(R range)
		if (isBidirectionalRange!R)
	{
		return range.retro()
			.takeUntil!pred()
			.retro();
	}
That works, but returns a different type. Ideally rfind would return the same type as the initial range, R.
[...] Hmm. Then it is indeed impossible with the current primitives, because to iterate from the back, one would have to pop off the back, so by definition you can't reach the back anymore. T -- Государство делает вид, что платит нам зарплату, а мы делаем вид, что работаем.
Jan 15 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 15 January 2013 at 12:54:12 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that 
 returns the
 initial segment of the range until a given predicate becomes 
 true? Then
 you could implement rfind thus:

 	auto rfind(alias pred, R)(R range)
 		if (isBidirectionalRange!R)
 	{
 		return range.retro()
 			.takeUntil!pred()
 			.retro();
 	}
That works, but returns a different type. Ideally rfind would return the same type as the initial range, R. Andrei
Retro of retro is supposed to give back the original range, so I don't think it does.
Jan 15 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 9:16 PM, deadalnix wrote:
 On Tuesday, 15 January 2013 at 12:54:12 UTC, Andrei Alexandrescu wrote:
 On 1/15/13 1:31 AM, H. S. Teoh wrote:
 Hmm. What about introducing a new primitive takeUntil, that returns the
 initial segment of the range until a given predicate becomes true? Then
 you could implement rfind thus:

 auto rfind(alias pred, R)(R range)
 if (isBidirectionalRange!R)
 {
 return range.retro()
 .takeUntil!pred()
 .retro();
 }
That works, but returns a different type. Ideally rfind would return the same type as the initial range, R. Andrei
Retro of retro is supposed to give back the original range, so I don't think it does.
There's a takeUntil in the middle. Andrei
Jan 15 2013
prev sibling next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu 
wrote:
 (Warning: this assumes a fair bit of background with ranges and 
 the STL.)

 We've had a long-standing weakness of ranges when compared with 
 STL iterators: in a find() call we get to access the balance of 
 the range from the found position (if any) through the end of 
 the initial range. In contrast, STL's find returns an iterator 
 that can be combined with the beginning of the initial range or 
 the end of the initial range, thus offering two ranges instead 
 of one.

 D offers the findSplit family which partially deals with this, 
 but in an arguably imperfect way because the range from the 
 beginning of the initial range to the found element has a 
 different type than the initial range.

 In spite of that, things have worked out well.

 Nowadays I've been thinking (inspired by an exchange on a C=+ 
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real 
 challenge for D's ranges.

 In D, r.rfind(e) should return a range starting with the last 
 occurrence of e in r and spanning through the rest of r.

 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
 }
I'd hardly call that an ideal solution >:( The point of rfind is to not start the search from the start of the string. What if your input was: //---- string mobyDick = ...: auto r = rfind(mobyDick, ishmael); This may take a while... //---- That, and it would fail with rfind("A_bc_bc_bc_A", "bc_bc"): It would find "bc_bc_bc_A" instead of "bc_bc_A". The root cause is that a bidirectional range just can't be "cut"/"sliced" at a given point, whereas iterators allow this. As much as I love ranges, I think this is a fundamental flaw for anything less that hasSlicing. So far, the problem has stayed under the radar, thanks (because?) of take, but the problems it brings are clear: -Bias towards front iteration. -Type system warping. -Loss of bidirectionality when taken. Algorithms suffer because of this. Containers suffer because of this. I don't have any real solution to offer (there'd be a real complexity cost to any proposal), but I think there *needs* to be a way to provide "inter" slicing of ranges. Something along the lines of: //---- bidir a = [1, 2, 3, 4, 5, 6]; bidir b = a.drop(2).dropBack(2); //[3, 4]; auto l = cutBefore(a, b); //Gets the part of a that is before b auto r = cutAfter(a, b); //Gets the part of a that is after b static assert(typeof(a) == typeof(l)); static assert(typeof(a) == typeof(r)); assert(equal(l), [1, 2]); assert(equal(r), [5, 6]); //---- The advantage here is that: 1) consistent typing. 2) no direction bias. If we had this, then find could just return the slice containing the found element, and user could work his way from there. Problem is bloating the interface ... That, and we'd also need the versions that keep the cut element... //---- bidir a = [1, 2, 3, 4, 5, 6]; bidir b = a.drop(2).dropBack(2); //[3, 4]; auto l = mergeBefore(a, b); //Gets the part of a that is before b, with b auto r = mergeAfter(a, b); //Gets the part of a that is after b, with b assert(equal(l), [1, 2, 3, 4]); assert(equal(r), [3, 4, 5, 6]); //---- It is a complicated problem, but one that is worth solving, IMO.
Jan 14 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 2:20 AM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu wrote:
 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }
I'd hardly call that an ideal solution >:( The point of rfind is to not start the search from the start of the string.
I may reply to the rest, but let me destroy this quickly: this is the implementation for a forward range that's not bidirectional. The challenge is indeed implementing rfind for bidirectional ranges that are not random (e.g. strings). Andrei
Jan 15 2013
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 12:56:43 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:20 AM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei 
 Alexandrescu wrote:
 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }
I'd hardly call that an ideal solution >:( The point of rfind is to not start the search from the start of the string.
I may reply to the rest, but let me destroy this quickly: this is the implementation for a forward range that's not bidirectional.
I'd argue against providing rfind at all if R is not bidirectional. But that's not the thread's main issue, so let's not focus on this.
 The challenge is indeed implementing rfind for bidirectional 
 ranges that are not random (e.g. strings).

 Andrei
Agreed. On a side note, it would be very easy if we agreed to return the subrange starting at the beginning of range, and ending after the last element. But that's really just avoiding the problem.
Jan 15 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 2:20 AM, monarch_dodra wrote:
 auto l = cutBefore(a, b); //Gets the part of a that is before b
 auto r = cutAfter(a, b); //Gets the part of a that is after b
I think cutBefore is sufficient. No? Ideas for better names? Andrei
Jan 15 2013
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 17:57:07 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:20 AM, monarch_dodra wrote:
 auto l = cutBefore(a, b); //Gets the part of a that is before b
 auto r = cutAfter(a, b); //Gets the part of a that is after b
I think cutBefore is sufficient. No? Ideas for better names? Andrei
I've been trying to brainstorm a couple of ideas. Here is one: //---- Tuple(R, R) cut(R other); //Returns both parts of the //current range that are //Before and After other usage: //---- Tuple(R, R) slices = a.cut(b); //Cuts a in two pieces, relative to b. assert(equal(a, chain(slices[0], b, slices[1]))); From there, we adding just the "merge" primitive would give us: //---- R result = merge(b, slices[1]); The neat thing here is that we have consistent typing the entire algorithm through. Yay! That's two extra functions, which can provide slicing and merging for bidirectional ranges. And they can be provided without breaking anything existing. Not too shabby (IMO). This is still very sketchy of course, so don't destroy too hard ;)
Jan 15 2013
prev sibling next sibling parent reply FG <home fgda.pl> writes:
On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
      for (;;)
      {
          auto ahead = range.save.find(element);
          if (ahead.empty) return range;
          range = ahead;
      }
 }
That example creates an infinite loop. Needs a popFront: R rfind(R, E)(R range, E element) { if (range.empty) return range; for (;;) { auto ahead = range.save; ahead.popFront(); ahead = ahead.find(element); if (ahead.empty) return range; range = ahead; } } Another thing: if the element is not found, I think an empty range should be returned and not the whole range.
Jan 15 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 6:53 AM, FG wrote:
 On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
 for (;;)
 {
 auto ahead = range.save.find(element);
 if (ahead.empty) return range;
 range = ahead;
 }
 }
That example creates an infinite loop. Needs a popFront: R rfind(R, E)(R range, E element) { if (range.empty) return range; for (;;) { auto ahead = range.save; ahead.popFront(); ahead = ahead.find(element); if (ahead.empty) return range; range = ahead; } } Another thing: if the element is not found, I think an empty range should be returned and not the whole range.
Thanks for the fixes! Andrei
Jan 15 2013
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01/15/2013 12:53 PM, FG wrote:
 On 2013-01-15 06:51, Andrei Alexandrescu wrote:
 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
      for (;;)
      {
          auto ahead = range.save.find(element);
          if (ahead.empty) return range;
          range = ahead;
      }
 }
That example creates an infinite loop. Needs a popFront: R rfind(R, E)(R range, E element) { if (range.empty) return range; for (;;) { auto ahead = range.save; ahead.popFront(); ahead = ahead.find(element); if (ahead.empty) return range; range = ahead; } }
That is buggy too.
 Another thing: if the element is not found, I think an empty range
 should be returned and not the whole range.
Yup. R rfind(R, E)(R range,E element){ auto r=range.find(element); if(r.empty) return r; for(;;){ auto ahead=r.save; ahead.popFront(); ahead=ahead.find(element); if(ahead.empty) return r; r=ahead; } } Full proof (assuming purity of the range primitives): /+pure+/ R tail(R)(R r)out(result){ assert(result=={ auto a=r; a.popFront(); return a; }()); }body{ auto a=r; a.popFront(); return a; } /+pure+/ R rfind(R, E)(R range,E element)out(result){ assert(range.find(element).empty&&result.empty||!range.find(element).empty&&!result.empty&&result.front==element&&result.tail.find(element).empty); }body{ auto r=range.find(element); assert(r==range.find(element)&&(r.empty||r.front==element)); if(r.empty){ assert(r==range.find(element)&&r.empty); return r; /+assert(result==range.find(element)&&result.empty); assert(range.find(element).empty&&result.empty);+/ } assert(r==range.find(element)&&!r.empty&&r.front==element); assert(!range.find(element).empty&&!r.empty&&r.front==element); for(;;){ assert(true&&!range.find(element).empty&&!r.empty&&r.front==element); assert(!range.find(element).empty&&!r.empty&&r.front==element&&!r.empty); auto ahead=r.save; assert(!range.find(element).empty&&!r.empty&&r.front==element&&!ahead.empty&&ahead==r.save); ahead.popFront(); assert(!range.find(element).empty&&!r.empty&&r.front==element&&ahead=={ auto ahead=r.save; ahead.popFront(); return ahead; }()); assert(!range.find(element).empty&&!r.empty&&r.front==element&&ahead==r.tail); ahead=ahead.find(element); assert(!range.find(element).empty&&!r.empty&&r.front==element&&(ahead.empty||ahead.front==element)&&ahead==r.tail.find(element)); if(ahead.empty){ assert(!range.find(element).empty&&ahead.empty&&!r.empty&&r.front==element&&ahead==r.tail.find(element)); return r; /+assert(!range.find(element).empty&&ahead.empty&&!result.empty&&result.front==element&&ahead==result.tail.find(element)); assert(!range.find(element).empty&&!result.empty&&result.front==element&&result.tail.find(element).empty);+/ } assert(!range.find(element).empty&&!ahead.empty&&ahead.front==element); r=ahead; assert(!range.find(element).empty&&!r.empty&&r.front==element); } assert(false); }
Jan 15 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/15/2013 02:30 PM, Timon Gehr wrote:
 /+pure+/ R tail(R)(R r)out(result){
      assert(result=={
          auto a=r;
          a.popFront();
          return a;
      }());
 }body{
      auto a=r;
      a.popFront();
      return a;
 }
Obviously, this should be /+pure+/ R tail(R)(R r)out(result){ assert(result=={ auto a=r.save; a.popFront(); return a; }()); }body{ auto a=r.save; a.popFront(); return a; }
Jan 15 2013
prev sibling parent FG <home fgda.pl> writes:
On 2013-01-15 14:30, Timon Gehr wrote:
 That is buggy too.
Thanks for adding the necessary check for element being at position 0.
Jan 15 2013
prev sibling parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Tuesday, 15 January 2013 at 05:51:16 UTC, Andrei Alexandrescu 
wrote:
 (Warning: this assumes a fair bit of background with ranges and 
 the STL.)

 We've had a long-standing weakness of ranges when compared with 
 STL iterators: in a find() call we get to access the balance of 
 the range from the found position (if any) through the end of 
 the initial range. In contrast, STL's find returns an iterator 
 that can be combined with the beginning of the initial range or 
 the end of the initial range, thus offering two ranges instead 
 of one.

 D offers the findSplit family which partially deals with this, 
 but in an arguably imperfect way because the range from the 
 beginning of the initial range to the found element has a 
 different type than the initial range.

 In spite of that, things have worked out well.

 Nowadays I've been thinking (inspired by an exchange on a C=+ 
 mailing list) that STL's rfind (http://goo.gl/Btg8s) is a real 
 challenge for D's ranges.

 In D, r.rfind(e) should return a range starting with the last 
 occurrence of e in r and spanning through the rest of r.

 If r is a forward range but not better, this is rather simple:

 R rfind(R, E)(R range, E element)
 {
     for (;;)
     {
         auto ahead = range.save.find(element);
         if (ahead.empty) return range;
         range = ahead;
     }
 }

 If the range is random-access, the algorithm is easy to 
 implement efficiently by scanning from the end and using 
 indexing.

 The tricky case is with bidirectional range. I was unable to 
 define an algorithm using the current primitives that: (a) 
 scans the range from the end, and (b) returns a range from the 
 found position through the end.

 Looks like rfind() would need one extra primitive for 
 bidirectional ranges. What would be a good one? I have a few 
 starters, but don't want to bias anyone. Please chime in!


 Thanks,

 Andrei
Interesting problem. One possibility would be to add a CLASS primitive like this: Range.merge( Range start, Range end ) //Takes the start of the start and the end of end. Pseudo: original = range.save resPos = range.retro.find return Range.merge( resPos, original ) //DOES NOT WORK The problem with this solution is that: Retro returns a different range type and it would be more complicated to have the merge function work with this type. An addition to this solution could be another new primitive: reverse: Pseudo: original = range.save reversed = range.reverse //Permanently invert start an end. I think this is feasible for all bidirectional ranges. Still a bidirectional range. reversed.find( needle ) //In haystack :) return Range.merge( reversed, original ) //Ok, start of reversed is on needle and end of original hasn't moved. The order of the range returned is the same as the one received. This is just a suggestion and any primitive could be renamed. However, I think reverse is a good place to start. Phil
Jan 15 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 12:03 PM, Phil Lavoie wrote:
 Pseudo:
 original = range.save
 resPos = range.retro.find
 return Range.merge( resPos, original ) //DOES NOT WORK

 The problem with this solution is that:
 Retro returns a different range type and it would be more complicated to
 have the merge function work with this type.
This would be difficult to implement safely, which is an important disadvantage.
 An addition to this solution could be another new primitive: reverse:

 Pseudo:
 original = range.save
 reversed = range.reverse //Permanently invert start an end. I think this
 is feasible for all bidirectional ranges. Still a bidirectional range.
 reversed.find( needle ) //In haystack :)
 return Range.merge( reversed, original ) //Ok, start of reversed is on
 needle and end of original hasn't moved. The order of the range returned
 is the same as the one received.

 This is just a suggestion and any primitive could be renamed. However, I
 think reverse is a good place to start.
Reverse is really cool and immediate to implement safely for a lot of ranges I can think of. How would you define rfind assuming reverse? Andrei
Jan 15 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Tuesday, 15 January 2013 at 17:14:32 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 12:03 PM, Phil Lavoie wrote:
 Pseudo:
 original = range.save
 resPos = range.retro.find
 return Range.merge( resPos, original ) //DOES NOT WORK

 The problem with this solution is that:
 Retro returns a different range type and it would be more 
 complicated to
 have the merge function work with this type.
This would be difficult to implement safely, which is an important disadvantage.
 An addition to this solution could be another new primitive: 
 reverse:

 Pseudo:
 original = range.save
 reversed = range.reverse //Permanently invert start an end. I 
 think this
 is feasible for all bidirectional ranges. Still a 
 bidirectional range.
 reversed.find( needle ) //In haystack :)
 return Range.merge( reversed, original ) //Ok, start of 
 reversed is on
 needle and end of original hasn't moved. The order of the 
 range returned
 is the same as the one received.

 This is just a suggestion and any primitive could be renamed. 
 However, I
 think reverse is a good place to start.
Reverse is really cool and immediate to implement safely for a lot of ranges I can think of. How would you define rfind assuming reverse? Andrei
Ok then, imagine we use reverse, reversed and merge primitives: auto rfind( Range, E )( Range r , E e ) if( blablabla... ) { auto original = r.save; auto reversed = r.reverse; auto found = find( reversed, e ); auto res = Range.merge( found, original ); return original.reversed ? res : res.reverse; } And correcting what I said previously: void reverse() { _reversed = !reversed; }
Jan 15 2013
next sibling parent "Phil Lavoie" <maidenphil hotmail.com> writes:
   return original.reversed ? res : res.reverse;
And correcting again, invert both to: return original.reversed? res.reverse: res;
Jan 15 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 12:24 PM, Phil Lavoie wrote:
 Ok then, imagine we use  reverse,  reversed and  merge primitives:
That's too many. Simpler approaches? Andrei
Jan 15 2013
parent reply FG <home fgda.pl> writes:
On 2013-01-15 19:00, Andrei Alexandrescu wrote:
 That's too many. Simpler approaches?
Let me give an outside perspective of someone that doesn't work with D all day. For forward ranges the original method is fine, but for bidirectional r and e I would expect the following code to somehow just work, but it doesn't: auto rfind(R1 r, R2 e) { return retro(findSplitAfter(retro(r), retro(e))[0]); } Other than the tiny detail of this not working, existing stuff like findSplit, findSplitBefore and findSplitAfter would be enough to define their r* versions.
Jan 15 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 19:44:39 UTC, FG wrote:
 On 2013-01-15 19:00, Andrei Alexandrescu wrote:
 That's too many. Simpler approaches?
Let me give an outside perspective of someone that doesn't work with D all day. For forward ranges the original method is fine, but for bidirectional r and e I would expect the following code to somehow just work, but it doesn't: auto rfind(R1 r, R2 e) { return retro(findSplitAfter(retro(r), retro(e))[0]); } Other than the tiny detail of this not working, existing stuff like findSplit, findSplitBefore and findSplitAfter would be enough to define their r* versions.
Well, you just stumbled on the entire problem. Given a bidirectional range BR, findSplitBefore actually cheats. This is the basic algorithm: popFront until you find the what you are looking for. Return the found range, as well as "what was before" The problem is that there is no real way to return "what was before", so we use a tool called "Take", that adapts a range to only hold the first N elements. Basically, it returns "take(br, n)". The problem though is that: a) The type is not BR anymore, but Take!BR b) Take!BR is not bidirectional. So basically: //-------- BR br = BR(1, 2, 3, 4); result = br.findSplitAfter([3]); auto lhs = result[0]; //[1, 2] auto rhs = result[1]; //[4] //typeof(lhs): Take!BR //typeof(rhs): BR //-------- And this is the root problem. As you can see, element 0 is not actually a bidirectional range anymore, nor is it of type BR either. This "tiny detail" as you say, is actually fundamental ;) This is why I'd like to find a way to cut bidirectional ranges: Not just for r* functions, but also for our everyday functions: As you can see, we don't have any way to "findBefore" and store the result in a BR. Ouch. As a user of D, I agree with you that it should "just work". Real life is different though :/
Jan 15 2013
parent FG <home fgda.pl> writes:
On 2013-01-15 21:14, monarch_dodra wrote:
 This is why I'd like to find a way to cut bidirectional ranges: Not just for r*
 functions, but also for our everyday functions: As you can see, we don't have
 any way to "findBefore" and store the result in a BR. Ouch.

 As a user of D, I agree with you that it should "just work". Real life is
 different though :/
Ah, right. That Take has bitten me before when I was discussing string slicing using negative indexes. :) In that case I wish this cutting that preserves the type of range will get implemented and the cheats in findSplit removed.
Jan 15 2013
prev sibling parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
Ok, so to sum it up:
rfind should return the same type as provided.
Its direction (popFront()) should be the same.

auto rfing( Range, E )( Range r, E e ) if (blablabla) {
   auto original = r.save; //This is to remember the end of the 
orginal range.
   r.reverse; //Reverse it. The range keeps track of its direction.
   auto found = find( r, e ); //Now front() shall give us e, and 
popFront() moves toward original's beginning.
   auto res = Range.merge( found, original ); //Take the start of 
found, which is on e and take the end of original, which has not 
moved.
   //Where does popFront() goes now? Depends: either expect merge 
to decide for us or make sure its just dumb.
   //If it is dumb, then the assumption we make about the 
direction of the returned range is that popFront() goes in the 
not reversed direction (reversed returns false). Therefore, we 
have start on e and end where we want it, we only need to move in 
the proper direction, which is the one of the original.
   if( original.reversed ) { res.reverse; }
   return res;
}

struct DListRange {
   Node * _start;
   Node * _end; //Initialized to the lists'end.
   bool _reversed = false;

   bool empty() { return _start is null || _end is null; } //The 
user has seen it all.

   void popBack() {
     if( _reversed ) {
      _start = _start.next;
     } else {
      _end = _end.previous;
     }
   }

   ...

    property bool reversed() { return _reversed; }
   void reverse() { _reversed = !_reversed; }

   //Dumb version.
   static typeof( this ) merge( typeof( this ) sr, typeof( this ) 
er ) {
    return typeof( this )( sr._start, er._end );
   }
}
Jan 15 2013
parent "Phil Lavoie" <maidenphil hotmail.com> writes:
   bool empty() { return _start is null || _end is null; } //The
This would have to be corrected. I agree that there are too many primitives.
Jan 15 2013
prev sibling next sibling parent "Phil Lavoie" <maidenphil hotmail.com> writes:
 The order of the range returned is the same as the one received.
Not true. It should default to the genuine bidirectional range's direction. If it was reversed before calling rfind we could: let the user reverse the result again or add the possibility to query the range to know it it's already been reversed (compared to original ordering): property bool reversed() {} In fact, this is to be known anyways if the range has to keep internal data to know in which direction it has to move. Meaning, reverse should be implemented like that: VOID reverse() { _reversed = true; } instead of ReversedBidirectionalRange reverse() { return blablabla... } There truly are multiple ways to implement a solution to this problem. We could also have merge decide the direction of the range based on the "end" range (since we are moving towards this one).
Jan 15 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 15, 2013 at 06:03:06PM +0100, Phil Lavoie wrote:
[...]
 An addition to this solution could be another new primitive:
 reverse:
 
 Pseudo:
   original = range.save
   reversed = range.reverse //Permanently invert start an end. I
 think this is feasible for all bidirectional ranges. Still a
 bidirectional range.
[...] I like this very much, actually. I think .reverse is much more useful than .back and .popBack. I mean, how many algorithms actually use .back and .popBack? Probably less than a handful, if any. Most algorithms use .front and popFront(). Viewed in that light, what then is a bidirectional range, if not a range where you can swap the direction of iteration? That is, we can redefine a bidirectional range to be one that implements .reverse, with the property that R.reverse.reverse == R. Get rid of .back and .popBack, which hardly anybody uses anyway. It simplifies the range API significantly, and makes rfind implementable in terms of primitives. T -- Ruby is essentially Perl minus Wall.
Jan 15 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
On Tuesday, 15 January 2013 at 18:19:48 UTC, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 06:03:06PM +0100, Phil Lavoie wrote:
 [...]
 An addition to this solution could be another new primitive:
 reverse:
 
 Pseudo:
   original = range.save
   reversed = range.reverse //Permanently invert start an end. I
 think this is feasible for all bidirectional ranges. Still a
 bidirectional range.
[...] I like this very much, actually. I think .reverse is much more useful than .back and .popBack. I mean, how many algorithms actually use .back and .popBack? Probably less than a handful, if any. Most algorithms use .front and popFront(). Viewed in that light, what then is a bidirectional range, if not a range where you can swap the direction of iteration? That is, we can redefine a bidirectional range to be one that implements .reverse, with the property that R.reverse.reverse == R. Get rid of .back and .popBack, which hardly anybody uses anyway. It simplifies the range API significantly, and makes rfind implementable in terms of primitives. T
+1 I was about to suggest a solution based on Reversible ranges instead of bidirectional ranges, but I have yet to polish it.
Jan 15 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
And, if ever needed, popBack could become this:

void popBack( R )( ref R r ) if( isReversibleRange!R ) {
   auto reversed = r.reverse.popFront();
   r = reversed.reverse;
}
Jan 15 2013
parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
Continuing with reversible ranges:

struct ForwardRange {
  _start;
  _end;

  BackwardRange reverse() { return BackwardRange( _end, _start ); }
}

struct BackwardRange {
  _start;
  _end;

  //guess what reverse is.
}

That would give the power to mimick bidirectional ranges (see 
post on popBack before)

Now, we still need a way to move one of those pointers to a given 
place to fulfill rfind:

auto rfind( R, E )( R r, E e ) if( isReversibleRange!R ) {
  auto original = r.save; //For the end purpose;
  auto found = find( r.reverse, e ); //found is reversed and 
starts on e.
  //What we want now is a range of type typeof( original ) 
starting on found but ending on original. Therefore, we need an 
additional primitive. Suggestion: jumpTo.
  return original.jumpTo( found ); //Keeps the ordering, only move 
the start.
}

struct ForwardRange {
...
void jumpTo( ForwardRange r )( _start = r._start; }
void jumpTo( BackwardRange r )( _start = r._start; }

}
Jan 15 2013
next sibling parent reply "Phil Lavoie" <maidenphil hotmail.com> writes:
It'd be nicer if we could use an operator, but this is wishful 
thinking:

auto r = r2.jumpTo( d2 ); //r2d2, yes.
Equivalent to:
auto r = r2[ d2 .. $ ];
Or:
auto r = r2[ d2 ... ];

Again, wishful thinking.
Jan 15 2013
parent "Phil Lavoie" <maidenphil hotmail.com> writes:
On Tuesday, 15 January 2013 at 19:11:43 UTC, Phil Lavoie wrote:
 It'd be nicer if we could use an operator, but this is wishful 
 thinking:

 auto r = r2.jumpTo( d2 ); //r2d2, yes.
 Equivalent to:
 auto r = r2[ d2 .. $ ];
 Or:
 auto r = r2[ d2 ... ];

 Again, wishful thinking.
What about startOn instead? That makes two additional primitives and one additional range type( -Bidirectional + Reversible + StartableOn??!?!??!?!?!)
Jan 15 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:
I don't think .reverse will take us far enough. Won't work with arrays which kinda puts a monkey wrench into everything. r1.before(r2) works much better: R rfind(R, E)(R r, E e) { auto original = r.save; for (; !r.empty; r.popBack) { if (r.endsWith(e)) break; } return original.before(r); } The generic implementation of before is trivial: auto before(R)(R theBuck, R stopsHere) { static struct Result { bool empty() { return r1 is r2; } auto ref front() { return r1.front; } void popFront() { r1.popFront; } private R r1, r2; } return Result(theBuck, stopsHere); } For random-access ranges: auto before(R)(R theBuck, R stopsHere) { return theBuck[0 .. theBuck.length - stopsHere.length]; } (Glossing over checks and balances etc.) Bidirectional ranges would need to implement before natively to return the same type as the original range. The question is whether before() is enough for many other algorithms we'd want to define. Andrei
Jan 15 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
Continuing with reversible ranges:
I don't think .reverse will take us far enough. Won't work with arrays which kinda puts a monkey wrench into everything.
[...] Not really. We can define an array wrapper with a .reverse that returns the original array, something like: // In std.array auto reverse(R)(R array) if (is(R _ : E[], E) { static struct ReverseArray { R[] src; property auto front() { return src[$-1]; } property bool empty() { return src.empty; } property void popFront() { src = src[0..$-1]; } property auto reverse() { return src; } ... // other wrapper methods } return ReverseArray(array); } This will let you implement rfind in a way that returns the original array. T -- Not all rumours are as misleading as this one.
Jan 15 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 4:22 PM, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:
I don't think .reverse will take us far enough. Won't work with arrays which kinda puts a monkey wrench into everything.
[...] Not really. We can define an array wrapper with a .reverse that returns the original array
Yah, there's retro already. My point is .reverse is not powerful enough. From what I can tell .before is. Andrei
Jan 15 2013
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 21:24:19 UTC, H. S. Teoh wrote:
 On Tue, Jan 15, 2013 at 03:59:17PM -0500, Andrei Alexandrescu 
 wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
Continuing with reversible ranges:
I don't think .reverse will take us far enough. Won't work with arrays which kinda puts a monkey wrench into everything.
[...] Not really. We can define an array wrapper with a .reverse that returns the original array, something like: // In std.array auto reverse(R)(R array) if (is(R _ : E[], E) { static struct ReverseArray { R[] src; property auto front() { return src[$-1]; } property bool empty() { return src.empty; } property void popFront() { src = src[0..$-1]; } property auto reverse() { return src; } ... // other wrapper methods } return ReverseArray(array); } This will let you implement rfind in a way that returns the original array. T
I'm having trouble understanding what you guys are going on about. Isn't this "reverse" just retro? And how would it solve anything, when retro doesn't? I thought the point of "reverse" was that it preserved type, but this adapter doesn't do that... Sorry for being dense, but I don't get it :/ I may have missed something earlier in the thread. Could you elaborate on the solution...?
Jan 15 2013
prev sibling next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 2:07 PM, Phil Lavoie wrote:
 Continuing with reversible ranges:
I don't think .reverse will take us far enough. Won't work with arrays which kinda puts a monkey wrench into everything. r1.before(r2) works much better: R rfind(R, E)(R r, E e) { auto original = r.save; for (; !r.empty; r.popBack) { if (r.endsWith(e)) break; } return original.before(r); } The generic implementation of before is trivial: auto before(R)(R theBuck, R stopsHere) { static struct Result { bool empty() { return r1 is r2; } auto ref front() { return r1.front; } void popFront() { r1.popFront; } private R r1, r2; } return Result(theBuck, stopsHere); } For random-access ranges: auto before(R)(R theBuck, R stopsHere) { return theBuck[0 .. theBuck.length - stopsHere.length]; } (Glossing over checks and balances etc.) Bidirectional ranges would need to implement before natively to return the same type as the original range. The question is whether before() is enough for many other algorithms we'd want to define. Andrei
But the goal was having a range that start at needle, and ends at haystackEnd. Your implementation doesn't do that. It would have to use a "after" instead of a "before". Your "before" proposal is similar to a take, but instead of being "index" based, it is "sentinel" based. This does have its strengths too, but doesn't solve the problem. The problem is that "after", just like the would be "takeBack", can't offer the front primitive for bidirectional ranges. I think that long story short, and regardless of return types: Unless the range has some sort of built-in primitive that allows some form of extracting a subrange from it, there is no way to obtain a range from the back of a bidirectional range.
Jan 15 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/15/13 4:57 PM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei Alexandrescu wrote:
[snip]
 But the goal was having a range that start at needle, and ends at
 haystackEnd. Your implementation doesn't do that. It would have to use a
 "after" instead of a "before".
Sorry, got ahead of myself. r.retro.before would be needed, which requires r.after. I'd like to only add one primitive if at all possible, or have a solid proof that two are needed. Back to the drawing board... Andrei
Jan 15 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 15 January 2013 at 22:49:45 UTC, Andrei Alexandrescu 
wrote:
 On 1/15/13 4:57 PM, monarch_dodra wrote:
 On Tuesday, 15 January 2013 at 20:59:17 UTC, Andrei 
 Alexandrescu wrote:
[snip]
 But the goal was having a range that start at needle, and ends 
 at
 haystackEnd. Your implementation doesn't do that. It would 
 have to use a
 "after" instead of a "before".
Sorry, got ahead of myself. r.retro.before would be needed, which requires r.after. I'd like to only add one primitive if at all possible, or have a solid proof that two are needed. Back to the drawing board... Andrei
I think the whole point of reverse is to avoid duplicating before/after front/back popFront/popBack etc . . .
Jan 15 2013
prev sibling parent FG <home fgda.pl> writes:
On 2013-01-15 21:59, Andrei Alexandrescu wrote:
 The generic implementation of before is trivial:

 auto before(R)(R theBuck, R stopsHere) {
    static struct Result {
      bool empty() { return r1 is r2; }
      auto ref front() { return r1.front; }
      void popFront() { r1.popFront; }
      private R r1, r2;
    }
    return Result(theBuck, stopsHere);
 }
I'm scared and would feel safer with: bool empty() { return r1 is r2 || r1.empty(); }
 For random-access ranges:

 auto before(R)(R theBuck, R stopsHere) {
    return theBuck[0 .. theBuck.length - stopsHere.length];
 }
This implies that theBuck._end == stopsHere._end, but why?
Jan 15 2013