www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Do sorted ranges have any special properties?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I'm trying to find justifications for keeping assumeSorted and friends 
within Phobos. Background: assumeSorted(r) where r is some range returns 
a value that wraps r and clarifies to the caller that it can assume r 
has been sorted.

The obvious use case in Phobos is that find(haystack, needle) can 
proceed faster if haystack == assumeSorted(something).

The nicest way to go about this is to make the type returned by 
assumeSorted a kind of "range with benefits". That is, it would 
implement all range primitives, plus a few more primitives that are 
advantageous for sorted ranges. Question is: what are those? What kind 
of cool primitives could a sorted range define?

Here are a few I can think of:

find -> uses binary search if random-access, or at least early stopping 
otherwise.

minElement -> r.front

maxElement -> r.back

topN -> essentially does nothing

median -> r[r.length / 2]

Such functions could have free counterparts in std.algorithm. The free 
functions check whether the range already implements the fast functions 
and uses them transparently if present. So then we have e.g. find() 
which works in linear time for most ranges, but logarithmic time on 
random-access sorted ranges.

Other ideas?


Thanks,

Andrei
Jul 25 2010
next sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.
 
 The obvious use case in Phobos is that find(haystack, needle) can
 proceed faster if haystack == assumeSorted(something).
 
 The nicest way to go about this is to make the type returned by
 assumeSorted a kind of "range with benefits". That is, it would
 implement all range primitives, plus a few more primitives that are
 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?
 
 Here are a few I can think of:
 
 find -> uses binary search if random-access, or at least early stopping
 otherwise.
 
 minElement -> r.front
 
 maxElement -> r.back
 
 topN -> essentially does nothing
 
 median -> r[r.length / 2]
 
 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.

Sort of trivial, but I'd like to see the predicate with which it's sorted. Also, I can't stop thinking that it's stand-alone find()'s job utilize whatever features the range has (be it random access, sortedness, or anything else) to execute fast, not the passed in range's. Checking for a member find() sounds good but doesn't have anything specific to do with the range being sorted. Any range could define its find() to special-case on its structure. BTW, would it be a big deal if sort() returned assumeSorted!Range? Many uses are a re-sort + binary search combo. Tomek
Jul 25 2010
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 25 Jul 2010 15:44:36 -0400, Tomek Sowiński <just ask.me> wrote:

 Andrei Alexandrescu wrote:

 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.

 The obvious use case in Phobos is that find(haystack, needle) can
 proceed faster if haystack == assumeSorted(something).

 The nicest way to go about this is to make the type returned by
 assumeSorted a kind of "range with benefits". That is, it would
 implement all range primitives, plus a few more primitives that are
 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?

 Here are a few I can think of:

 find -> uses binary search if random-access, or at least early stopping
 otherwise.

 minElement -> r.front

 maxElement -> r.back

 topN -> essentially does nothing

 median -> r[r.length / 2]

 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.

Sort of trivial, but I'd like to see the predicate with which it's sorted.

I think the "range" returned by assumeSorted has a predicate alias in it already (which defaults to "a < b")
 BTW, would it be a big deal if sort() returned assumeSorted!Range? Many  
 uses
 are a re-sort + binary search combo.

Yeah, that would be good. The only problem I see with this is that "sortedness" is broken quite easily in ways that does not remove the assumeSorted type. For example, if assumeSorted just forwards its underlying range's methods, then given an array arr: auto x = assumeSorted(arr.sort); x[0] = 999999; // probably no longer sorted x.find(999999); // probably doesn't work. And there are other subtle ways to break it in ways that you couldn't even program assumeSorted to deal with: arr[0] = 999999; // how does x detect this? I feel like assumeSorted is something that's temporary. It can only be used as a parameter or return decoration, but as soon as you assign it to a variable, it's gone. Not sure if that can work. User-defined attributes would be nice here :) -Steve
Jul 26 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Sun, 25 Jul 2010 15:44:36 -0400, Tomek Sowiński <just ask.me> wrote:
 
 Andrei Alexandrescu wrote:

 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.

 The obvious use case in Phobos is that find(haystack, needle) can
 proceed faster if haystack == assumeSorted(something).

 The nicest way to go about this is to make the type returned by
 assumeSorted a kind of "range with benefits". That is, it would
 implement all range primitives, plus a few more primitives that are
 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?

 Here are a few I can think of:

 find -> uses binary search if random-access, or at least early stopping
 otherwise.

 minElement -> r.front

 maxElement -> r.back

 topN -> essentially does nothing

 median -> r[r.length / 2]

 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.

Sort of trivial, but I'd like to see the predicate with which it's sorted.

I think the "range" returned by assumeSorted has a predicate alias in it already (which defaults to "a < b")

We just need to do the STL trick of re-exposing the template argument from within the template: struct SortedRange(alias _less) { alias binaryFun!_less less; ... } Clients will use SomeType.less.
 BTW, would it be a big deal if sort() returned assumeSorted!Range? 
 Many uses
 are a re-sort + binary search combo.

Yeah, that would be good. The only problem I see with this is that "sortedness" is broken quite easily in ways that does not remove the assumeSorted type. For example, if assumeSorted just forwards its underlying range's methods, then given an array arr: auto x = assumeSorted(arr.sort); x[0] = 999999; // probably no longer sorted x.find(999999); // probably doesn't work.

I was considering disabling that by having SortedRange's operators return rvalues instead of lvalues.
 And there are other subtle ways to break it in ways that you couldn't 
 even program assumeSorted to deal with:
 
 arr[0] = 999999; // how does x detect this?

That's pretty brutal :o).
 I feel like assumeSorted is something that's temporary.  It can only be 
 used as a parameter or return decoration, but as soon as you assign it 
 to a variable, it's gone.  Not sure if that can work.  User-defined 
 attributes would be nice here :)

I agree that assumeSorted is somewhat evanescent. I'm still undecided whether it deserves a place in Phobos or not. Andrei
Jul 26 2010
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tomek Sowin'ski wrote:
 Andrei Alexandrescu wrote:
 
 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.

 The obvious use case in Phobos is that find(haystack, needle) can
 proceed faster if haystack == assumeSorted(something).

 The nicest way to go about this is to make the type returned by
 assumeSorted a kind of "range with benefits". That is, it would
 implement all range primitives, plus a few more primitives that are
 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?

 Here are a few I can think of:

 find -> uses binary search if random-access, or at least early stopping
 otherwise.

 minElement -> r.front

 maxElement -> r.back

 topN -> essentially does nothing

 median -> r[r.length / 2]

 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.

Sort of trivial, but I'd like to see the predicate with which it's sorted.

Thanks for your feedback. Yah, that's a given.
 Also, I can't stop thinking that it's stand-alone find()'s job utilize 
 whatever features the range has (be it random access, sortedness, or 
 anything else) to execute fast, not the passed in range's.

Good point, though that reintroduces the question of comparing find's predicate with SortedRange's predicate.
 Checking for a member find() sounds good but doesn't have anything specific 
 to do with the range being sorted. Any range could define its find() to 
 special-case on its structure.

And that's not bad I think. If the range feels it has defined enough structure to implement fast find, maybe find should defer to it. I'm unclear about the best strategy here.
 BTW, would it be a big deal if sort() returned assumeSorted!Range? Many uses 
 are a re-sort + binary search combo.

Yah, that would be great. Andrei
Jul 26 2010
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-07-27 01:04:28 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Tomek Sowin'ski wrote:
 Checking for a member find() sounds good but doesn't have anything 
 specific to do with the range being sorted. Any range could define its 
 find() to special-case on its structure.

And that's not bad I think. If the range feels it has defined enough structure to implement fast find, maybe find should defer to it. I'm unclear about the best strategy here.

That would be like implementing the uniform calling syntax for certain functions. I like it, except that it being only for certain functions breaks the 'uniform' thing. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jul 27 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Philippe Sigaud wrote:
 On Tue, Jul 27, 2010 at 07:04, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> 
 wrote:
 
 
         Also, I can't stop thinking that it's stand-alone find()'s job
         utilize whatever features the range has (be it random access,
         sortedness, or anything else) to execute fast, not the passed in
         range's.
 
 
     Good point, though that reintroduces the question of comparing
     find's predicate with SortedRange's predicate.
 
 
 I really don't see how you would do that in a generic way... Even taking 
 into account that predicates return a very simple value (bool) and that 
 they terminate (well, the input range's one _was_ used to sort it), 
 that's akin to determining if two unknown functions produce equal values.

Yah; it would help somehow if there was a way to get the body of a function as a void[]. The address already exists, we need the sizeof. Andrei
Jul 27 2010
parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

 I really don't see how you would do that in a generic way... Even taking
 into account that predicates return a very simple value (bool) and that
 they terminate (well, the input range's one was used to sort it),
 that's akin to determining if two unknown functions produce equal values.

Yah; it would help somehow if there was a way to get the body of a function as a void[]. The address already exists, we need the sizeof.

Isn't dereferencing function pointers illegal? Tomek
Jul 27 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tomek Sowiński wrote:
 Andrei Alexandrescu wrote:
 
 I really don't see how you would do that in a generic way... Even taking
 into account that predicates return a very simple value (bool) and that
 they terminate (well, the input range's one was used to sort it),
 that's akin to determining if two unknown functions produce equal values.

function as a void[]. The address already exists, we need the sizeof.

Isn't dereferencing function pointers illegal?

We need that too :o). Andrei
Jul 27 2010
parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

  Isn't dereferencing function pointers illegal?
 
 We need that too :o).

Don't get it:) Need them to be dereferencable or not? BTW, time ago I posted that the expression &fun should be immutable and that function pointers should be treated as values because they can't be dereferenced. Maybe we need to revive this forgotten post: http://www.mail-archive.com/digitalmars-d puremagic.com/msg25104.html Tomek
Jul 27 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--001636c5b4b8df7395048c5e5907
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Jul 27, 2010 at 07:04, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 Also, I can't stop thinking that it's stand-alone find()'s job utilize
 whatever features the range has (be it random access, sortedness, or
 anything else) to execute fast, not the passed in range's.

Good point, though that reintroduces the question of comparing find's predicate with SortedRange's predicate.

into account that predicates return a very simple value (bool) and that they terminate (well, the input range's one _was_ used to sort it), that's akin to determining if two unknown functions produce equal values. --001636c5b4b8df7395048c5e5907 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote">On Tue, Jul 27, 2010 at 07:04, Andrei Alexandres= cu <span dir=3D"ltr">&lt;<a href=3D"mailto:SeeWebsiteForEmail erdani.org">S= eeWebsiteForEmail erdani.org</a>&gt;</span> wrote:<br><blockquote class=3D"= gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb= (204, 204, 204); padding-left: 1ex;"> <div><div></div><br></div><div class=3D"im"><blockquote class=3D"gmail_quot= e" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204,= 204); padding-left: 1ex;">Also, I can&#39;t stop thinking that it&#39;s st= and-alone find()&#39;s job utilize whatever features the range has (be it r= andom access, sortedness, or anything else) to execute fast, not the passed= in range&#39;s.<br> </blockquote> <br></div> Good point, though that reintroduces the question of comparing find&#39;s p= redicate with SortedRange&#39;s predicate.<div class=3D"im"><br></div></blo= ckquote><div><br>I really don&#39;t see how you would do that in a generic = way... Even taking into account that predicates return a very simple value = (bool) and that they terminate (well, the input range&#39;s one _was_ used = to sort it), that&#39;s akin to determining if two unknown functions produc= e equal values.<br> <br><br><br><br>=A0<br></div></div> --001636c5b4b8df7395048c5e5907--
Jul 27 2010
prev sibling next sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

 median -> r[r.length / 2]

Generalize to quantile. http://en.wikipedia.org/wiki/Quantile Tomek
Jul 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tomek Sowiński wrote:
 Andrei Alexandrescu wrote:
 
 median -> r[r.length / 2]

Generalize to quantile. http://en.wikipedia.org/wiki/Quantile

Makes sense. Andrei
Jul 26 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:
 btw, takeWhile is a _very_ useful function to add to std.algorithm.

I was planning to ask for it in an enhancement request. Bye, bearophile
Jul 27 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Philippe Sigaud wrote:
 
 
 On Tue, Jul 27, 2010 at 07:13, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> 
 wrote:
 
     Tomek Sowiski wrote:
 
         Andrei Alexandrescu wrote:
 
             median -> r[r.length / 2]
 
 
         Generalize to quantile.
         http://en.wikipedia.org/wiki/Quantile
 
 
     Makes sense.
 
     Andrei
 
 
 
 Ah, not as methods, but free functions taking sorted ranges:
 
 
 *  fast merge (on n ranges in one go),
 * fast intersection, union, difference, symmetricDifference. But I see 
 these already exist in std.algo for sets.
 * Also, maybe, some kind of takeUpTo(range, Max), that takes all values 
 up to a certain value Max. As it's ordered, we know there is no more 
 value less than Max. I don't if there is a mathematical name for that.
 
 It's just a specialized takeWhile!((ElementType!R v) { return 
 v<Max;})(range);
 
 btw, takeWhile is a _very_ useful function to add to std.algorithm.

I definitely wanted to add that... in fact I vaguely thought I did! Could you please make a bugzilla entry so it's not forgotten? Thanks, Andrei
Jul 27 2010
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016e6daa610f527e1048c5e0547
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Tue, Jul 27, 2010 at 07:13, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 Tomek Sowi=C5=84ski wrote:

 Andrei Alexandrescu wrote:

  median -> r[r.length / 2]

Generalize to quantile. http://en.wikipedia.org/wiki/Quantile

Makes sense. Andrei

Ah, not as methods, but free functions taking sorted ranges: * fast merge (on n ranges in one go), * fast intersection, union, difference, symmetricDifference. But I see thes= e already exist in std.algo for sets. * Also, maybe, some kind of takeUpTo(range, Max), that takes all values up to a certain value Max. As it's ordered, we know there is no more value les= s than Max. I don't if there is a mathematical name for that. It's just a specialized takeWhile!((ElementType!R v) { return v<Max;})(range); btw, takeWhile is a _very_ useful function to add to std.algorithm. Philippe --0016e6daa610f527e1048c5e0547 Content-Type: text/html; charset=ISO-8859-2 Content-Transfer-Encoding: quoted-printable <br><br><div class=3D"gmail_quote">On Tue, Jul 27, 2010 at 07:13, Andrei Al= exandrescu <span dir=3D"ltr">&lt;<a href=3D"mailto:SeeWebsiteForEmail erdan= i.org">SeeWebsiteForEmail erdani.org</a>&gt;</span> wrote:<br><blockquote c= lass=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px s= olid rgb(204, 204, 204); padding-left: 1ex;"> <div><div></div><div class=3D"h5">Tomek Sowi=F1ski wrote:<br> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> Andrei Alexandrescu wrote:<br> <br> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> median -&gt; r[r.length / 2]<br> </blockquote> <br> Generalize to quantile.<br> <a href=3D"http://en.wikipedia.org/wiki/Quantile" target=3D"_blank">http://= en.wikipedia.org/wiki/Quantile</a><br> </blockquote> <br></div></div> Makes sense.<br><font color=3D"#888888"> <br> Andrei<br> </font></blockquote></div><br><br>Ah, not as methods, but free functions ta= king sorted ranges:<br><br><br>*=A0 fast merge (on n ranges in one go),<br>= * fast intersection, union, difference, symmetricDifference. But I see thes= e already exist in std.algo for sets.<br> * Also, maybe, some kind of takeUpTo(range, Max), that takes all values up = to a certain value Max. As it&#39;s ordered, we know there is no more value= less than Max. I don&#39;t if there is a mathematical name for that.<br> <br>It&#39;s just a specialized takeWhile!((ElementType!R v) { return v&lt;= Max;})(range);<br><br>btw, takeWhile is a _very_ useful function to add to = std.algorithm.<br><br><br><br>Philippe<br> --0016e6daa610f527e1048c5e0547--
Jul 27 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016e6d7ef7a3c3509048c97c8b7
Content-Type: text/plain; charset=ISO-8859-1

2010/7/27 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>

 btw, takeWhile is a _very_ useful function to add to std.algorithm.

I definitely wanted to add that... in fact I vaguely thought I did! Could you please make a bugzilla entry so it's not forgotten?

Better late than never... Here it is: http://d.puremagic.com/issues/show_bug.cgi?id=4535 <http://d.puremagic.com/issues/show_bug.cgi?id=4535>Philippe --0016e6d7ef7a3c3509048c97c8b7 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote">2010/7/27 Andrei Alexandrescu <span dir=3D"ltr">= &lt;<a href=3D"mailto:SeeWebsiteForEmail erdani.org">SeeWebsiteForEmail erd= ani.org</a>&gt;</span><br><blockquote class=3D"gmail_quote" style=3D"margin= :0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"><div class=3D"im">btw, takeWhile is a _very_= useful function to add to std.algorithm.<br> </div></blockquote> <br> I definitely wanted to add that... in fact I vaguely thought I did! Could y= ou please make a bugzilla entry so it&#39;s not forgotten?<br></blockquote>= <div><br></div><div>Better late than never...=A0</div><div><br></div><div> Here it is:</div><div><a href=3D"http://d.puremagic.com/issues/show_bug.cgi= ?id=3D4535">http://d.puremagic.com/issues/show_bug.cgi?id=3D4535</a></div><= div><br></div><div><a href=3D"http://d.puremagic.com/issues/show_bug.cgi?id= =3D4535"></a>Philippe</div> <div>=A0</div></div> --0016e6d7ef7a3c3509048c97c8b7--
Jul 30 2010
prev sibling next sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Andrei Alexandrescu wrote:

 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.
 
 The obvious use case in Phobos is that find(haystack, needle) can
 proceed faster if haystack == assumeSorted(something).

I think the best justification is not in Phobos alone, but having a standard way of signaling that a range is sorted. If this is not in Phobos, other authors will have to use their own ways of communicating this precondition to the client. sortedness is general and useful enough.
 The nicest way to go about this is to make the type returned by
 assumeSorted a kind of "range with benefits". That is, it would
 implement all range primitives, plus a few more primitives that are
 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?
 
 Here are a few I can think of:
 
 find -> uses binary search if random-access, or at least early stopping
 otherwise.
 
 minElement -> r.front
 
 maxElement -> r.back
 
 topN -> essentially does nothing
 
 median -> r[r.length / 2]
 
 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.
 
 Other ideas?
 
 
 Thanks,
 
 Andrei

I agree with Tomek Sowinski here, only the predicate is needed and perhaps a way of retrieving the original range. A small assumeSorted range seems more consistent with the general ranges approach.
Jul 25 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Philippe Sigaud wrote:
 
     Andrei Alexandrescu wrote:
 
      > I'm trying to find justifications for keeping assumeSorted and
     friends
      > within Phobos. Background: assumeSorted(r) where r is some range
     returns
      > a value that wraps r and clarifies to the caller that it can assume r
      > has been sorted.
 
  
 
      > advantageous for sorted ranges. Question is: what are those? What
     kind
      > of cool primitives could a sorted range define?
      >
      > Here are a few I can think of:
      >
      > find -> uses binary search if random-access, or at least early
     stopping
      > otherwise.
      >
      > minElement -> r.front
      >
      > maxElement -> r.back
      >
      > topN -> essentially does nothing
      >
      > median -> r[r.length / 2]
 
 
 It's not exactly what you're asking for, but my first thought was: 
 "propagation". When you have a sorted range, it's rich, it's something 
 you worked for to get. You don't want to waste it. So higher-order 
 ranges and some other functions should take care to propagate the 
 information.
 
 * slicing a sorted range should produce a sorted range. so your wrapper 
 range opIndex must be modified to account for this.

Cool, makes sense.
 * ditto for .save()

Yah.
 * Take(sortedRange, 5) should indicate it's also sorted.

Fortunately take already returns the type of the slice if the range supports slice, so that's in place already.
 * The same for retro(sorted), but with a predicate being Not!pred (or 
 something akin to this anyway)

Whoa, interesting.
 * filter ! predicate (sortedRange) is also sorted
 * stride
 etc, etc.

Yah... that is all cool. My only reservations are that with what we have right now this looks like a lot of manual special casing. By the way, chain() with ranges sorted with the same predicate is also sorted... Then, such an effort would be much better motivated if sorted ranges had some really interesting properties. Aside from those discussed - there's not a lot of them!
      > Such functions could have free counterparts in std.algorithm. The
     free
      > functions check whether the range already implements the fast
     functions
      > and uses them transparently if present. So then we have e.g. find()
      > which works in linear time for most ranges, but logarithmic time on
      > random-access sorted ranges.
      >
 
 
 
 * Warning, daydreaming ahead *
 
 As a side note, I'm playing with a Graph struct. And (generic) trees are 
 just graphs, statically speaking. I can encode trees as graphs. The only 
 difference is at runtime: no loop, one ancestor per node, etc. So, it's 
 a bit like the standard range/sorted range problem: in general, being 
 sorted is a runtime property. Being able to define this at the type 
 level is quite interesting and that's what your assumeSorted does.
 
 
 But, in general, it would be nice to have a way to add flags/properties 
 as side-cars to an input type. 
 
 something like:
 
 struct WithProperty(Original, alias property)
 {
      Original _original;
      /+ insert here some alias this-like magic to make WithProperty melt 
 into an Original and expose _original most of the time +/
      /+ expose property (SortedBy!pred, ConvergesTo!someValue, etc) with 
 an alias.
 }
 
 Of course, it should be recursive: calling WithProperty on a 
 WithProperty!(SomeType, OtherProperties) should add the new property to 
 a CT-list, maybe a type-tuple.
 And there should be a way to disable some properties. As I said before, 
 filter ! pred (some sorted range) is sorted, so it should indicate it. 
 But map ! fun (sorted range) is _not_ sorted in general, and should 
 'off' this particular property.
 
 What you're doing is defining 'interfaces', a sort of CT duck-typing, 
 and your constraints templates check for the presence or absence of 
 those new functions. I now realize much of what I longed for could be 
 encoded in this way.
 
 * Convergence to a certain value => expose a .limit() primitive.
 * every value in the range occupy a certain 'range', smaller than what 
 allows ElementType!Range (constraining values between 0.0 and 1.0 for a 
 double-producing range, for example) => expose minValue/maxValue methods.
 
 OK, it's past midnight around here, tomorrow my daughters will want to 
 jump in the Mediterranean for hours, again. I'll shut up and go to bed.

Sounds like fun - of both the hacking and family kind. Keep those ideas coming, this is very interesting and something terse and usable could come out of it. Andrei
Jul 26 2010
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
Personally, I would just expose all accelerated algorithms for sorted 
ranges, and make the user call them explicitly e.g. if they know their 
range is sorted, make them call binarySearch (or whatever) instead of find.

I have two primary reasons for this:

1. Having the function choose the right algorithm puts an undue burden 
on the function writer, which translates into increased uncertainty for 
the user of the function, "If I call find(assumeSorted(r)), is it going 
to use binary search? What if I call SomeThirdParty.find? Will it know 
about assumeSorted?"

2. While we're only discussing assumeSorted now, there are a myriad of 
other assumeX wrappers that would be useful:

assumeUnimodal(range) - Allows ternary search for maxElement.
assumeDistinct(range) - uniq doesn't need to do anything.
assumePrime(int) - e.g. getFactors(assumePrime(x)) returns [1, x]
assumeAssociative(binOp) - power can work in O(lg(n)).

* power!(binOp)(a, b) := reduce!(binOp)(replicate(a, b))

Do we provide wrappers for these as well, and make the algorithms check 
for all relevant ones?

---

I say no. Provide ternarySearch, powerAssociative etc., and let the user 
choose them when he/she knows they are the right algorithm. That way, 
function writers can focus on writing specialized functions, and 
function users can rest assured that the function they are calling is 
performing the algorithm that they expect (with the complexity bounds 
that they expect).
Jul 27 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Peter Alexander wrote:
 Personally, I would just expose all accelerated algorithms for sorted 
 ranges, and make the user call them explicitly e.g. if they know their 
 range is sorted, make them call binarySearch (or whatever) instead of find.
 
 I have two primary reasons for this:
 
 1. Having the function choose the right algorithm puts an undue burden 
 on the function writer, which translates into increased uncertainty for 
 the user of the function, "If I call find(assumeSorted(r)), is it going 
 to use binary search? What if I call SomeThirdParty.find? Will it know 
 about assumeSorted?"
 
 2. While we're only discussing assumeSorted now, there are a myriad of 
 other assumeX wrappers that would be useful:
 
 assumeUnimodal(range) - Allows ternary search for maxElement.
 assumeDistinct(range) - uniq doesn't need to do anything.
 assumePrime(int) - e.g. getFactors(assumePrime(x)) returns [1, x]
 assumeAssociative(binOp) - power can work in O(lg(n)).
 
 * power!(binOp)(a, b) := reduce!(binOp)(replicate(a, b))
 
 Do we provide wrappers for these as well, and make the algorithms check 
 for all relevant ones?
 
 ---
 
 I say no. Provide ternarySearch, powerAssociative etc., and let the user 
 choose them when he/she knows they are the right algorithm. That way, 
 function writers can focus on writing specialized functions, and 
 function users can rest assured that the function they are calling is 
 performing the algorithm that they expect (with the complexity bounds 
 that they expect).

I am quoting this in full because I agree with it in full! I think this and opIn_r settle the matter in favor of making the additional goodies members of Sorted!Range. A few additional thoughts: - Experience with the STL has shown that "specialized functions for improved complexity with specialized syntax" (a la find as a member vs nonmember) fare better than "best-effort functions" (a la STL's distance). - I'm considering allowing assignment to .front, .back, [k] etc. with runtime enforcement that the assignment doesn't break the ordering. There is a lure to have them actually shuffle the assigned element in the right place, but that would break many assumptions. Many algorithms assume that after r.front = x they can assert(r.front == x). - All binary search operators (lowerBound, upperBound, equalRange) should be made members of Sorted!Range. So instead of writing: // We know r is sorted, baby auto r1 = upperBound(r, x); you'd write: auto r1 = assumeSorted(r).upperBound(x); thus essentially moving the comment into the code. (I also have a cool idea for statistically checking sortedness without deteriorating complexity. Essentially assumeSorted(r) could check that r is sorted by tossing a rigged coin with P(head)=1.0/r.length and deciding to check if head comes up. That way, the amortized complexity of the check is O(1)). This all raises the question: where should this Sorted(R) in the making go? It's a range so it should go in std.range, but it's mostly a motivator for algorithms, so it should go in std.algorithm. Tiebreakers? I'm very excited about this development. Maybe I should include some of this stuff in my Google talk on Friday. Andrei
Jul 27 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 (I also have a cool 
 idea for statistically checking sortedness without deteriorating 
 complexity. Essentially assumeSorted(r) could check that r is sorted by 
 tossing a rigged coin with P(head)=1.0/r.length and deciding to check if 
 head comes up. That way, the amortized complexity of the check is O(1)).

But the safety (reliability) of such test is low, and it can increase a lot the variance of the running time of a piece of code... So it's a cute idea, but I am not sure it's a good one. Bye, bearophile
Jul 27 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 (I also have a cool idea for statistically checking sortedness
 without deteriorating complexity. Essentially assumeSorted(r) could
 check that r is sorted by tossing a rigged coin with
 P(head)=1.0/r.length and deciding to check if head comes up. That
 way, the amortized complexity of the check is O(1)).

But the safety (reliability) of such test is low, and it can increase a lot the variance of the running time of a piece of code... So it's a cute idea, but I am not sure it's a good one.

Some reliability is considerably to have than patently none especially when approached in a principled manner (i.e. engineered to not influence complexity), and I don't think the variance argument holds except for synthetic cases (very few searches in very large ranges). Anyway, I think I'll insert the checks only in the debug version. Andrei
Jul 27 2010
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jonathan M Davis wrote:
 On Tuesday, July 27, 2010 07:05:11 Andrei Alexandrescu wrote:
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm. Tiebreakers?

It's a range, so put it in std.range. It's already likely pretty common to be importing both anyway, and std.algorithm has more in it than std.range at this point. Sure, it may be used with std.algorithm, but someone may have their own functions that they'd want to use it with without std.algorithm. I suppose that it is a bit borderline as to which module it should go in, but I'd argue that since it's a range that isn't associated with any particular function (like Find! or Until! or whatnot), it should go in std.range.

I agree with the argument. I just fear that someone wants binary search, searches the std.algorithm document for "binary", doesn't find any, and concludes it doesn't exist. I think I should drop such names as "binarySearch", "lowerBound" etc. in the documentation of std.algorithm and have the terms xref to std.range. Andrei
Jul 27 2010
prev sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

 - All binary search operators (lowerBound, upperBound, equalRange)
 should be made members of Sorted!Range. So instead of writing:
 
 // We know r is sorted, baby
 auto r1 = upperBound(r, x);
 
 you'd write:
 
 auto r1 = assumeSorted(r).upperBound(x);

Some of std.algorithm (e.g. find, partition) can profit on sortedness, others (e.g. group, setIntersection, setDifference, nWayUnion) require it. Do you want to put *all* of them as members of Sorted?
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm. Tiebreakers?

If members are algorithms then in algorithm... Tomek
Jul 27 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tomek Sowiński wrote:
 Andrei Alexandrescu wrote:
 
 - All binary search operators (lowerBound, upperBound, equalRange)
 should be made members of Sorted!Range. So instead of writing:

 // We know r is sorted, baby
 auto r1 = upperBound(r, x);

 you'd write:

 auto r1 = assumeSorted(r).upperBound(x);

Some of std.algorithm (e.g. find, partition) can profit on sortedness, others (e.g. group, setIntersection, setDifference, nWayUnion) require it. Do you want to put *all* of them as members of Sorted?

Affirmative.
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm. Tiebreakers?

If members are algorithms then in algorithm...

Yah but the type is a range. Andrei
Jul 27 2010
parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Philippe Sigaud wrote:

 Yah but the type is a range.

But then, so are Map and Filter. Personally, I do not consider them algorithms per se, they are too low-level: they're building blocks and, as ranges, should go in std.range. LevenshteinDistance or bringToFront, those are algorithms. My own experience of this is that most of the time, I import std.range only to get map/filter/reduce. That is, without reaching for a real algorithm. They are in std.algo for 'historical' reasons, because std.algo precedes std.range.

OK, range then. From a different angle, I was thinking that assumptions should give more; look at this: auto range = [2, 3, 4]; auto sorted = assumeSorted(r); range[0] = 666; // BANG! assumeSorted replaced range with Range.init sorted[0] = 666; // BANG! assertion in Sorted's setter went off range = sorted.dropAssumption; sorted[0] = 666; // BANG! now sorted's range reference is nulled. range[0] = 666; // OK... phew! Assumption producers like sort() would take the range by ref to comply with this idiom. Of course if you save() the range before it's eaten by assumeSorted, you can twiddle away through the saved range. Still, it's better than nothing. Oh, and the unique-like behavior can be expanded onto assumeUnique and, I dare say, assumeWhatever. Like it? Tomek
Jul 28 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tomek Sowiński wrote:
 Philippe Sigaud wrote:
 
 Yah but the type is a range.

But then, so are Map and Filter. Personally, I do not consider them algorithms per se, they are too low-level: they're building blocks and, as ranges, should go in std.range. LevenshteinDistance or bringToFront, those are algorithms. My own experience of this is that most of the time, I import std.range only to get map/filter/reduce. That is, without reaching for a real algorithm. They are in std.algo for 'historical' reasons, because std.algo precedes std.range.

OK, range then. From a different angle, I was thinking that assumptions should give more; look at this: auto range = [2, 3, 4]; auto sorted = assumeSorted(r); range[0] = 666; // BANG! assumeSorted replaced range with Range.init sorted[0] = 666; // BANG! assertion in Sorted's setter went off range = sorted.dropAssumption; sorted[0] = 666; // BANG! now sorted's range reference is nulled. range[0] = 666; // OK... phew! Assumption producers like sort() would take the range by ref to comply with this idiom. Of course if you save() the range before it's eaten by assumeSorted, you can twiddle away through the saved range. Still, it's better than nothing. Oh, and the unique-like behavior can be expanded onto assumeUnique and, I dare say, assumeWhatever. Like it? Tomek

Like it, just one issue - destructive copy is unlikely to solve any problems. The big problems arise with long-distance aliasing, much less from messing with the source of the assignment directly. Andrei
Jul 28 2010
parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Andrei Alexandrescu wrote:

 Tomek Sowiński wrote:
 From a different angle, I was thinking that assumptions should give more;
 look at this:
 
 auto range = [2, 3, 4];
 auto sorted = assumeSorted(r);
 range[0] = 666; // BANG! assumeSorted replaced range with Range.init
 sorted[0] = 666; // BANG! assertion in Sorted's setter went off
 range = sorted.dropAssumption;
 sorted[0] = 666;  // BANG! now sorted's range reference is nulled.
 range[0] = 666;  // OK... phew!
 
 Assumption producers like sort() would take the range by ref to comply
 with this idiom. Of course if you save() the range before it's eaten by
 assumeSorted, you can twiddle away through the saved range. Still, it's
 better than nothing. Oh, and the unique-like behavior can be expanded
 onto assumeUnique and, I dare say, assumeWhatever.
 
 Like it?
  
 
 Tomek

Like it, just one issue - destructive copy is unlikely to solve any problems. The big problems arise with long-distance aliasing, much less from messing with the source of the assignment directly.

I know. Yet, it does no harm and is better than what is now. Tomek
Jul 28 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisprog gmail.com> writes:
On Tuesday, July 27, 2010 07:05:11 Andrei Alexandrescu wrote:
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm. Tiebreakers?

It's a range, so put it in std.range. It's already likely pretty common to be importing both anyway, and std.algorithm has more in it than std.range at this point. Sure, it may be used with std.algorithm, but someone may have their own functions that they'd want to use it with without std.algorithm. I suppose that it is a bit borderline as to which module it should go in, but I'd argue that since it's a range that isn't associated with any particular function (like Find! or Until! or whatnot), it should go in std.range. - Jonathan M Davis
Jul 27 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 27 Jul 2010 12:59:00 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Jonathan M Davis wrote:
 On Tuesday, July 27, 2010 07:05:11 Andrei Alexandrescu wrote:
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm.  
 Tiebreakers?

common to be importing both anyway, and std.algorithm has more in it than std.range at this point. Sure, it may be used with std.algorithm, but someone may have their own functions that they'd want to use it with without std.algorithm. I suppose that it is a bit borderline as to which module it should go in, but I'd argue that since it's a range that isn't associated with any particular function (like Find! or Until! or whatnot), it should go in std.range.

I agree with the argument. I just fear that someone wants binary search, searches the std.algorithm document for "binary", doesn't find any, and concludes it doesn't exist. I think I should drop such names as "binarySearch", "lowerBound" etc. in the documentation of std.algorithm and have the terms xref to std.range.

That's what indexes are for :) -Steve
Jul 27 2010
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jul 27, 10 13:20, Andrei Alexandrescu wrote:
 Philippe Sigaud wrote:
 Andrei Alexandrescu wrote:


 Yah... that is all cool. My only reservations are that with what we have
 right now this looks like a lot of manual special casing. By the way,
 chain() with ranges sorted with the same predicate is also sorted...

No chain([4,5,6,7], [1,2]) // <-- not sorted unless you can also assert input[i-1].back <= input[i].front.
 Then, such an effort would be much better motivated if sorted ranges had
 some really interesting properties. Aside from those discussed - there's
 not a lot of them!

 Andrei

Jul 27 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
KennyTM~ wrote:
 On Jul 27, 10 13:20, Andrei Alexandrescu wrote:
 Philippe Sigaud wrote:
 Andrei Alexandrescu wrote:


 Yah... that is all cool. My only reservations are that with what we have
 right now this looks like a lot of manual special casing. By the way,
 chain() with ranges sorted with the same predicate is also sorted...

No chain([4,5,6,7], [1,2]) // <-- not sorted unless you can also assert input[i-1].back <= input[i].front.

Oops. Thanks for the correction. Andrei
Jul 27 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Philippe Sigaud wrote:
 On Tue, Jul 27, 2010 at 07:20, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>> 
 wrote:
 
 
     * slicing a sorted range should produce a sorted range. so your
     wrapper range opIndex must be modified to account for this.
 
     Cool, makes sense.
 
 
 
 Another thing, though I'm not sure it's a good idea, it to have them 
 define an opIndex[ElementType!R e], that would just either forward to 
 .find() or do a simple binary search by itself: O(lg(n)) is quite good.
  
 Yeah, isAssociativeRange!R  ;)
 But that's blurring the line between container and range, I guess.

One issue is dealing with sorted random-access ranges of size_t. Then there's ambiguity - which [] did you mean, searching on straight indexing?
 In the same vein, a sorted range could optionally define opIn_r that 
 also promises O(lg(n)) complexity.

This is a strong argument for putting sorted range functionality inside Sorted!Range.
 As for taking ideas from other languages: in Clojure, data structures 
 for which it makes sense (maps, sets?) are functions of their element 
 and return true if the provided value is in the structure. That allows 
 one to use them as predicate for finding and such, which makes for very 
 clean code, as they also have encoded literal values.

Not getting it, could you please expand/insert a link?
 Btw, should they really be called "sorted" or "ordered"? "sorted" 
 implies there was a sorting action, which is not always the case. It may 
 very well have been created that way.

Not sure, but as a putative user, with "sorted" I know 100% what's going on; with the slightly oblique "ordered" I might be worried enough to look up the manual. Andrei
Jul 27 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:

 But the idea is that if you want to filter a sequence to find all vowels for
example:
 
 auto vowels = set("aeiou");
 auto vowelsInMyText = filter!vowels(myText);
 
 vowels act as a predicate: if you call it with a char, it will returns true
 iff the char is in vowels. In Clojure, it produces very clean code.

In Python you can give a lamba function to filter(), or you can use: vowels = set("aeiou") txt = "hello how are you" print filter(vowels.__contains__, txt) __contains__ is the standard method used by the Python "in" operator. That code is a bit less nice than the Clojure code, but it's a bit more explicit (and you can use other member functions beside __contains__). This is similar code in D, but it doesn't work: import std.range, std.algorithm, std.array; void main() { auto vowels = ['a':0, 'e':0, 'i':0, 'o':0, 'u':0]; string txt = "hello how are you"; auto f = filter!(&vowels.opBinary!("in"))(txt); writeln(array(f)); } I don't know if there is some way to make that code work. Bye, bearophile
Jul 28 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:

 auto vowelsInMyText = filter ! q{a in assumeSorted("aeiou")} (myText);

In a so small array of chars (or even wchars/dchars) a sequential search (if well implemented!) is the fastest algorithm. Bye, bearophile
Jul 28 2010
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016e6daa625a4a4c4048c5e4e52
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Jul 27, 2010 at 07:20, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 * slicing a sorted range should produce a sorted range. so your wrapper
 range opIndex must be modified to account for this.

 Cool, makes sense.

Another thing, though I'm not sure it's a good idea, it to have them define an opIndex[ElementType!R e], that would just either forward to .find() or do a simple binary search by itself: O(lg(n)) is quite good. Yeah, isAssociativeRange!R ;) But that's blurring the line between container and range, I guess. In the same vein, a sorted range could optionally define opIn_r that also promises O(lg(n)) complexity. As for taking ideas from other languages: in Clojure, data structures for which it makes sense (maps, sets?) are functions of their element and return true if the provided value is in the structure. That allows one to use them as predicate for finding and such, which makes for very clean code, as they also have encoded literal values.
  * ditto for .save()

Yah. * Take(sortedRange, 5) should indicate it's also sorted.

Fortunately take already returns the type of the slice if the range supports slice, so that's in place already.

I was about to reply that not all sorted ranges have slicing, but I think it's better to restrict this to 'rich' ranges: Random-Access and slicing. So, OK.
  * The same for retro(sorted), but with a predicate being Not!pred (or
 something akin to this anyway)

Whoa, interesting.

It's the only case where I found that you could act on the predicate without analyzing it.
  * filter ! predicate (sortedRange) is also sorted
 * stride
 etc, etc.

Yah... that is all cool. My only reservations are that with what we have right now this looks like a lot of manual special casing. By the way, chain() with ranges sorted with the same predicate is also sorted... Then, such an effort would be much better motivated if sorted ranges had some really interesting properties. Aside from those discussed - there's not a lot of them!

Yes, I agree. Maybe for retro() it's interesting, as retro could replace foreach_reverse... Btw, should they really be called "sorted" or "ordered"? "sorted" implies there was a sorting action, which is not always the case. It may very well have been created that way. Maybe (just maybe, I'm extending myself there) it's because you tend to use ranges this way: - read from a file some content over which you had no control. - act on the read stuff, most probably sort it to prepare for the real action - do the heavy-duty computation. Whereas I tend to produce my own ranges, so I design them to be ordered/sorted, always positive or whatever. Philippe --0016e6daa625a4a4c4048c5e4e52 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote">On Tue, Jul 27, 2010 at 07:20, Andrei Alexandres= cu <span dir=3D"ltr">&lt;<a href=3D"mailto:SeeWebsiteForEmail erdani.org">S= eeWebsiteForEmail erdani.org</a>&gt;</span> wrote:<br><blockquote class=3D"= gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb= (204, 204, 204); padding-left: 1ex;"> <br><div class=3D"im">* slicing a sorted range should produce a sorted rang= e. so your wrapper range opIndex must be modified to account for this.<br> </div> <br> Cool, makes sense.<br> <br></blockquote><div><br><br>Another thing, though I&#39;m not sure it&#39= ;s a good idea, it to have them define an opIndex[ElementType!R e], that wo= uld just either forward to .find() or do a simple binary search by itself: = O(lg(n)) is quite good.<br> =A0<br>Yeah, isAssociativeRange!R=A0 ;)<br>But that&#39;s blurring the line= between container and range, I guess. <br><br><br>In the same vein, a sort= ed range could optionally define opIn_r that also promises O(lg(n)) complex= ity. <br> <br>As for taking ideas from other languages: in Clojure, data structures= =20 for which it makes sense (maps, sets?) are functions of their element and return true if the provided value is in the structure. That allows one=20 to use them as predicate for finding and such, which makes for very=20 clean code, as they also have encoded literal values. <br> <br><br>=A0</div><blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt= 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> * ditto for .save()<br> </blockquote> <br> Yah.<div class=3D"im"><br> <br> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> * Take(sortedRange, 5) should indicate it&#39;s also sorted.<br> </blockquote> <br></div> Fortunately take already returns the type of the slice if the range support= s slice, so that&#39;s in place already.</blockquote><div><br>I was about t= o reply that not all sorted ranges have slicing, but I think it&#39;s bette= r to restrict this to &#39;rich&#39; ranges: Random-Access and slicing. So,= OK.<br> <br><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0p= t 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><di= v class=3D"im"><br> <br> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> * The same for retro(sorted), but with a predicate being Not!pred (or somet= hing akin to this anyway)<br> </blockquote> <br></div> Whoa, interesting.</blockquote><div><br>It&#39;s the only case where I foun= d that you could act on the predicate without analyzing it. <br>=A0</div><b= lockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-= left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> <div class=3D"im"><br> <br> <blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; borde= r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> * filter ! predicate (sortedRange) is also sorted<br> * stride<br> etc, etc.<br> </blockquote> <br></div> Yah... that is all cool. My only reservations are that with what we have ri= ght now this looks like a lot of manual special casing. By the way, chain()= with ranges sorted with the same predicate is also sorted...<br> <br> Then, such an effort would be much better motivated if sorted ranges had so= me really interesting properties. Aside from those discussed - there&#39;s = not a lot of them!<br></blockquote><div><br>Yes, I agree. Maybe for retro()= it&#39;s interesting, as retro could replace foreach_reverse...<br> <br>Btw, should they really be called &quot;sorted&quot; or &quot;ordered&q= uot;? &quot;sorted&quot; implies there was a sorting action, which is not a= lways the case. It may very well have been created that way.<br><br>Maybe (= just maybe, I&#39;m extending myself there) it&#39;s because you tend to us= e ranges this way:<br> <br>- read from a file some content over which you had no control.<br>- act= on the read stuff, most probably sort it to prepare for the real action<br=
- do the heavy-duty computation.<br><br>Whereas I tend to produce my own r=

br> <br><br>Philippe<br><br></div></div> --0016e6daa625a4a4c4048c5e4e52--
Jul 27 2010
prev sibling next sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0015173ff38e6271df048c71b925
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wed, Jul 28, 2010 at 00:18, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 Tomek Sowi=C5=84ski wrote:

 Andrei Alexandrescu wrote:

 - All binary search operators (lowerBound, upperBound, equalRange)
 should be made members of Sorted!Range. So instead of writing:

 // We know r is sorted, baby
 auto r1 =3D upperBound(r, x);

 you'd write:

 auto r1 =3D assumeSorted(r).upperBound(x);

Some of std.algorithm (e.g. find, partition) can profit on sortedness, others (e.g. group, setIntersection, setDifference, nWayUnion) require i=


 Do you want to put *all* of them as members of Sorted?

Affirmative.

btw, std.algorithm.sort does an in-place sort and as such does not affect the input type. It might be interesting to call it sortInPlace instead (or anything that has your fancy) and have sort() _returns_ a Sorted!Range and automatically get all the goodies. Also, the .sort properties for arrays may be ditched and use std.algo instead, by way of a free function in std.array. And, while I'm at it, most of these primitives do not work for infinite ranges, which may perfectly be sorted, have random-access and slicing. Case in point: the natural numbers 0,1,2,... I guess infinity will not be accepted in that case... If that's so, my impression is that most algorithms will require something very array-like: random-access && hasLength && hasSlicing && !isInfinite. That may be interesting to produce a new template constraint: template isArrayLike(R) // aka "Rich Range" {...}
 This all raises the question: where should this Sorted(R) in the making
 go? It's a range so it should go in std.range, but it's mostly a
 motivator for algorithms, so it should go in std.algorithm. Tiebreakers=




If members are algorithms then in algorithm...

Yah but the type is a range.

But then, so are Map and Filter. Personally, I do not consider them algorithms per se, they are too low-level: they're building blocks and, as ranges, should go in std.range. LevenshteinDistance or bringToFront, those are algorithms. My own experience of this is that most of the time, I import std.range _only_ to get map/filter/reduce. That is, without reaching for a real algorithm. They are in std.algo for 'historical' reasons, because std.algo precedes std.range. Philippe --0015173ff38e6271df048c71b925 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote"><br><br><div class=3D"gmail_quote">On Wed, Jul 2= 8, 2010 at 00:18, Andrei Alexandrescu=C2=A0<span dir=3D"ltr">&lt;<a href=3D= "mailto:SeeWebsiteForEmail erdani.org">SeeWebsiteForEmail erdani.org</a>&gt= ;</span>=C2=A0wrote:<br> <blockquote class=3D"gmail_quote" style=3D"margin-top: 0px; margin-right: 0= px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-= left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex= ; "> <div class=3D"im">Tomek Sowi=C5=84ski wrote:<br><blockquote class=3D"gmail_= quote" style=3D"margin-top: 0px; margin-right: 0px; margin-bottom: 0px; mar= gin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 2= 04); border-left-style: solid; padding-left: 1ex; "> Andrei Alexandrescu wrote:<br><br><blockquote class=3D"gmail_quote" style= =3D"margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.= 8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-= left-style: solid; padding-left: 1ex; "> - All binary search operators (lowerBound, upperBound, equalRange)<br>shoul= d be made members of Sorted!Range. So instead of writing:<br><br>// We know= r is sorted, baby<br>auto r1 =3D upperBound(r, x);<br><br>you&#39;d write:= <br> <br>auto r1 =3D assumeSorted(r).upperBound(x);<br></blockquote><br>Some of = std.algorithm (e.g. find, partition) can profit on sortedness, others (e.g.= group, setIntersection, setDifference, nWayUnion) require it. Do you want = to put *all* of them as members of Sorted?<br> </blockquote><br></div>Affirmative.</blockquote><div><br></div><div><br></d= iv><div>btw, std.algorithm.sort does an in-place sort and as such does not = affect the input type. It might be interesting to call it sortInPlace inste= ad (or anything that has your fancy) and have sort() _returns_ a Sorted!Ran= ge and automatically get all the goodies.</div> <div><br></div><div>Also, the .sort properties for arrays may be ditched an= d use std.algo instead, by way of a free function in std.array.</div><div><= br></div><div><br></div><div>And, while I&#39;m at it, most of these primit= ives do not work for infinite ranges, which may perfectly be sorted, have r= andom-access and slicing. Case in point: the natural numbers 0,1,2,...</div=

that case... If that&#39;s so, my impression is that most algorithms will = require something very array-like: random-access &amp;&amp; hasLength &amp;= &amp; hasSlicing &amp;&amp; !isInfinite.</div> <div>That may be interesting to produce a new template constraint:</div><di= v><br></div><div>template isArrayLike(R) // aka &quot;Rich Range&quot;</div=
<div>{...}</div><div><br></div><div>=C2=A0</div><blockquote class=3D"gmail=

rgin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, = 204); border-left-style: solid; padding-left: 1ex; "> <div class=3D"im"><br><br><blockquote class=3D"gmail_quote" style=3D"margin= -top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; borde= r-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style= : solid; padding-left: 1ex; "> <blockquote class=3D"gmail_quote" style=3D"margin-top: 0px; margin-right: 0= px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-= left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex= ; "> This all raises the question: where should this Sorted(R) in the making<br>= go? It&#39;s a range so it should go in std.range, but it&#39;s mostly a<br=
motivator for algorithms, so it should go in std.algorithm. Tiebreakers?<b=

</blockquote><br>If members are algorithms then in algorithm...<br></blockq= uote><br></div>Yah but the type is a range.</blockquote><div><br></div><div=
But then, so are Map and Filter.=C2=A0</div><div>Personally, I do not cons=

locks and, as ranges, should go in std.range. LevenshteinDistance or bringT= oFront, those are algorithms.=C2=A0</div> <div><br></div><div>My own experience of this is that most of the time, I i= mport std.range _only_ to get map/filter/reduce. That is, without reaching = for a real algorithm. They are in std.algo for &#39;historical&#39; reasons= , because std.algo precedes std.range.</div> <div><br></div><div><br></div><div>Philippe</div><div><br></div><div><br></= div></div></div> --0015173ff38e6271df048c71b925--
Jul 28 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0016e6daa62517ec78048c725bb0
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Jul 27, 2010 at 15:48, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 Another thing, though I'm not sure it's a good idea, it to have them
 define an opIndex[ElementType!R e], that would just either forward to
 .find() or do a simple binary search by itself: O(lg(n)) is quite good.
  Yeah, isAssociativeRange!R  ;)
 But that's blurring the line between container and range, I guess.

One issue is dealing with sorted random-access ranges of size_t. Then there's ambiguity - which [] did you mean, searching on straight indexing?

ranges of indices are quite useful. OK, carry on.
  In the same vein, a sorted range could optionally define opIn_r that also
 promises O(lg(n)) complexity.

This is a strong argument for putting sorted range functionality inside Sorted!Range.

determine if its argument range is sorted will not use compile-time duck typing, with a isSortedRange!R, but will just see if the input is a Sorted!T? In that case, indicate quite clearly in Sorted docs that if the user knows a range is sorted, then she really ought to wrap it in Sorted. Anyway, Sorted will be an interesting example of a wrapper struct that transmits type-encoded information, like I wanted to have. Seeing how it works in practice will be interesting. It's really a way to provide at compile-time information on the (runtime) values. Though what I wanted is a proper subtype, whereas in Sorted!R case, Sorted is parallel to R, but due to the duck typing used everywhere, it will get along. I'm making any sense there? As a little brother to Sorted, you could have Bounded!R, that signifies the ranges values won't take all possible values of their element type: a range producing doubles, but all positive, for example, or a char range with all values in the 'a'-'z' range... Exposed methods: minBound, maxBound These do not tell you what is the max or the min of the range, they may not even exist if the range is infinite. It just tell you that all values will be between minBound and maxBound) Filter may be enough for this, but it's complicated to extract the min and max values. Another one I like is Periodic... (or, Cyclic). Right now, some of my algorithms, like rotateWhile!predicate(range) detect Cycle!R and Repeat!R and deal with it accordingly. I do not want rotateWhile!"a<6"(cycle([0,1,2,3])) to rotate without stopping.
  As for taking ideas from other languages: in Clojure, data structures for
 which it makes sense (maps, sets?) are functions of their element and return
 true if the provided value is in the structure. That allows one to use them
 as predicate for finding and such, which makes for very clean code, as they
 also have encoded literal values.

Not getting it, could you please expand/insert a link?

http://clojure.org/data_structures (maps are functions of their keys, sets are functions of their elements) and http://clojure.org/sequences <http://clojure.org/data_structures> That will fail in D, due to your remark on range with size_t elements that prevents using opIndex to make a range an associative range. But the idea is that if you want to filter a sequence to find all vowels for example: auto vowels = set("aeiou"); auto vowelsInMyText = filter!vowels(myText); vowels act as a predicate: if you call it with a char, it will returns true iff the char is in vowels. In Clojure, it produces very clean code. I guess in D, you'd do: auto vowelsInMyText!((dchar d) { return d in vowels;})(myText); or even: auto vowelsInMyText = filter ! q{a in assumeSorted("aeiou")} (myText); And I just convinced myself that the D way is good enough :) Philippe --0016e6daa62517ec78048c725bb0 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote">On Tue, Jul 27, 2010 at 15:48, Andrei Alexandres= cu <span dir=3D"ltr">&lt;<a href=3D"mailto:SeeWebsiteForEmail erdani.org" t= arget=3D"_blank">SeeWebsiteForEmail erdani.org</a>&gt;</span> wrote:<br><bl= ockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #= ccc solid;padding-left:1ex"> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"><div><br></div><div> Another thing, though I&#39;m not sure it&#39;s a good idea, it to have the= m define an opIndex[ElementType!R e], that would just either forward to .fi= nd() or do a simple binary search by itself: O(lg(n)) is quite good.<br> =A0Yeah, isAssociativeRange!R =A0;)<br> But that&#39;s blurring the line between container and range, I guess.<br> </div></blockquote> <br> One issue is dealing with sorted random-access ranges of size_t. Then there= &#39;s ambiguity - which [] did you mean, searching on straight indexing?<d= iv><br></div></blockquote><div><br></div><div>You&#39;re right. And I canno= t discard the notion of size_t ranges, because ranges of indices are quite = useful.</div> <div>OK, carry on.</div><div><br></div><div>=A0</div><blockquote class=3D"g= mail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-l= eft:1ex"><div> <br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> In the same vein, a sorted range could optionally define opIn_r that also p= romises O(lg(n)) complexity.<br> </blockquote> <br></div> This is a strong argument for putting sorted range functionality inside Sor= ted!Range.<div><br></div></blockquote><div><br></div><div>=A0agree. Does th= at also means that in that case, an algorithm trying to determine if its ar= gument range is sorted will not use compile-time duck typing, with a isSort= edRange!R, but will just see if the input is a Sorted!T?</div> <div><br></div><div>In that case, indicate quite clearly in Sorted docs tha= t if the user knows a range is sorted, then she really ought to wrap it in = Sorted.</div><div><br></div><div>Anyway, Sorted will be an interesting exam= ple of a wrapper struct that transmits type-encoded information, like I wan= ted to have. Seeing how it works in practice will be interesting. It&#39;s = really a way to provide at compile-time information on the (runtime) values= .</div> <div><br></div><div>Though what I wanted is a proper subtype, whereas in So= rted!R case, Sorted is parallel to R, but due to the duck typing used every= where, it will get along.=A0</div><div>I&#39;m making any sense there?</div=

have Bounded!R, that signifies the ranges values won&#39;t take all possib= le values of their element type: a range producing doubles, but all positiv= e, for example, or a char range with all values in the &#39;a&#39;-&#39;z&#= 39; range...</div> <div>Exposed methods: minBound, maxBound</div><div><br></div><div>These do = not tell you what is the max or the min of the range, they may not even exi= st if the range is infinite. It just tell you that all values will be betwe= en minBound and maxBound)</div> <div>Filter may be enough for this, but it&#39;s complicated to extract the= min and max values.</div><div><br></div><div>Another one I like is Periodi= c... (or, Cyclic). Right now, some of my algorithms, like rotateWhile!predi= cate(range) detect Cycle!R and Repeat!R and deal with it accordingly.=A0</d= iv> <div>I do not want</div><div><br></div><div>rotateWhile!&quot;a&lt;6&quot;(= cycle([0,1,2,3]))=A0</div><div><br></div><div>to rotate without stopping.= =A0</div><div><br></div><div>=A0</div><blockquote class=3D"gmail_quote" sty= le=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div> <br> <blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p= x #ccc solid;padding-left:1ex"> As for taking ideas from other languages: in Clojure, data structures for w= hich it makes sense (maps, sets?) are functions of their element and return= true if the provided value is in the structure. That allows one to use the= m as predicate for finding and such, which makes for very clean code, as th= ey also have encoded literal values.<br> </blockquote> <br></div> Not getting it, could you please expand/insert a link?</blockquote><div><br=
</div><div><br></div><div><a href=3D"http://clojure.org/data_structures">h=

s, sets are functions of their elements)</div> <div><br></div><div>and</div><div><br></div><div><a href=3D"http://clojure.= org/sequences">http://clojure.org/sequences</a></div><div><a href=3D"http:/= /clojure.org/data_structures"></a>=A0</div><div><div>That will fail in D, d= ue to your remark on range with size_t elements that prevents using opIndex= to make a range an associative range. But the idea is that if you want to = filter a sequence to find all vowels for example:</div> <div><br></div><div>auto vowels =3D set(&quot;aeiou&quot;);</div><div>auto = vowelsInMyText =3D filter!vowels(myText);</div><div><br></div><div>vowels a= ct as a predicate: if you call it with a char, it will returns true iff the= char is in vowels. In Clojure, it produces very clean code. I guess in D, = you&#39;d do:</div> <div><br></div><div>auto vowelsInMyText!((dchar d) { return d in vowels;})(= myText);</div><div><br></div><div>or even:</div><div><br></div><div>auto vo= welsInMyText =3D filter ! q{a in assumeSorted(&quot;aeiou&quot;)} (myText);= =A0</div> </div><div><br></div><div>And I just convinced myself that the D way is goo= d enough :)</div><div><br></div><div><br></div><div>Philippe</div><div><br>= </div></div><br> --0016e6daa62517ec78048c725bb0--
Jul 28 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
--0015175cd450ce4f58048c3df2b3
Content-Type: text/plain; charset=ISO-8859-1

 Andrei Alexandrescu wrote:

 I'm trying to find justifications for keeping assumeSorted and friends
 within Phobos. Background: assumeSorted(r) where r is some range returns
 a value that wraps r and clarifies to the caller that it can assume r
 has been sorted.


 advantageous for sorted ranges. Question is: what are those? What kind
 of cool primitives could a sorted range define?

 Here are a few I can think of:

 find -> uses binary search if random-access, or at least early stopping
 otherwise.

 minElement -> r.front

 maxElement -> r.back

 topN -> essentially does nothing

 median -> r[r.length / 2]


It's not exactly what you're asking for, but my first thought was: "propagation". When you have a sorted range, it's rich, it's something you worked for to get. You don't want to waste it. So higher-order ranges and some other functions should take care to propagate the information. * slicing a sorted range should produce a sorted range. so your wrapper range opIndex must be modified to account for this. * ditto for .save() * Take(sortedRange, 5) should indicate it's also sorted. * The same for retro(sorted), but with a predicate being Not!pred (or something akin to this anyway) * filter ! predicate (sortedRange) is also sorted * stride etc, etc.
 Such functions could have free counterparts in std.algorithm. The free
 functions check whether the range already implements the fast functions
 and uses them transparently if present. So then we have e.g. find()
 which works in linear time for most ranges, but logarithmic time on
 random-access sorted ranges.


* Warning, daydreaming ahead * As a side note, I'm playing with a Graph struct. And (generic) trees are just graphs, statically speaking. I can encode trees as graphs. The only difference is at runtime: no loop, one ancestor per node, etc. So, it's a bit like the standard range/sorted range problem: in general, being sorted is a runtime property. Being able to define this at the type level is quite interesting and that's what your assumeSorted does. But, in general, it would be nice to have a way to add flags/properties as side-cars to an input type. something like: struct WithProperty(Original, alias property) { Original _original; /+ insert here some alias this-like magic to make WithProperty melt into an Original and expose _original most of the time +/ /+ expose property (SortedBy!pred, ConvergesTo!someValue, etc) with an alias. } Of course, it should be recursive: calling WithProperty on a WithProperty!(SomeType, OtherProperties) should add the new property to a CT-list, maybe a type-tuple. And there should be a way to disable some properties. As I said before, filter ! pred (some sorted range) is sorted, so it should indicate it. But map ! fun (sorted range) is _not_ sorted in general, and should 'off' this particular property. What you're doing is defining 'interfaces', a sort of CT duck-typing, and your constraints templates check for the presence or absence of those new functions. I now realize much of what I longed for could be encoded in this way. * Convergence to a certain value => expose a .limit() primitive. * every value in the range occupy a certain 'range', smaller than what allows ElementType!Range (constraining values between 0.0 and 1.0 for a double-producing range, for example) => expose minValue/maxValue methods. OK, it's past midnight around here, tomorrow my daughters will want to jump in the Mediterranean for hours, again. I'll shut up and go to bed. Philippe --0015175cd450ce4f58048c3df2b3 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable <div class=3D"gmail_quote"><br><blockquote class=3D"gmail_quote" style=3D"m= argin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); paddin= g-left: 1ex;"><div class=3D"im">Andrei Alexandrescu wrote:<br> <br> &gt; I&#39;m trying to find justifications for keeping assumeSorted and fri= ends<br> &gt; within Phobos. Background: assumeSorted(r) where r is some range retur= ns<br> &gt; a value that wraps r and clarifies to the caller that it can assume r<= br> &gt; has been sorted.</div></blockquote><div>=A0</div><blockquote class=3D"= gmail_quote" style=3D"margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb= (204, 204, 204); padding-left: 1ex;"><div class=3D"im">&gt; advantageous fo= r sorted ranges. Question is: what are those? What kind<br> &gt; of cool primitives could a sorted range define?<br> &gt;<br> &gt; Here are a few I can think of:<br> &gt;<br> &gt; find -&gt; uses binary search if random-access, or at least early stop= ping<br> &gt; otherwise.<br> &gt;<br> &gt; minElement -&gt; r.front<br> &gt;<br> &gt; maxElement -&gt; r.back<br> &gt;<br> &gt; topN -&gt; essentially does nothing<br> &gt;<br> &gt; median -&gt; r[r.length / 2]<br> </div></blockquote><div><br>It&#39;s not exactly what you&#39;re asking for= , but my first thought was: &quot;propagation&quot;. When=20 you have a sorted range, it&#39;s rich, it&#39;s something you worked for t= o=20 get. You don&#39;t want to waste it. So higher-order ranges and some other = functions should take care to propagate the information.<br> <br> * slicing a sorted range should produce a sorted range. so your wrapper ran= ge opIndex must be modified to account for this.<br>* ditto for .save()<br>= <br> * Take(sortedRange, 5) should indicate it&#39;s also sorted.<br> * The same for retro(sorted), but with a predicate being Not!pred (or=20 something akin to this anyway) <br> * filter ! predicate (sortedRange) is also sorted<br>* stride<br> etc, etc.<br> <br><br></div><blockquote class=3D"gmail_quote" style=3D"margin: 0pt 0pt 0p= t 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><di= v class=3D"im">&gt;<br> &gt; Such functions could have free counterparts in std.algorithm. The free= <br> &gt; functions check whether the range already implements the fast function= s<br> &gt; and uses them transparently if present. So then we have e.g. find()<br=


&gt;<br></div></blockquote><div><br><br>* Warning, daydreaming ahead * <br>= <br></div></div>As a side note, I&#39;m playing with a Graph struct. And (g= eneric) trees are just graphs, statically speaking. I can encode trees as g= raphs. The only difference is at runtime: no loop, one ancestor per node, e= tc. So, it&#39;s a bit like the standard range/sorted range problem: in gen= eral, being sorted is a runtime property. Being able to define this at the = type level is quite interesting and that&#39;s=A0what your assumeSorted doe= s.<br> <br><br>But, in general, it would be nice to have a way to add flags/proper= ties as side-cars to an input type.=A0<br><br>something like:<br><br>struct= WithProperty(Original, alias property)<br>{<br>=A0=A0=A0=A0 Original _orig= inal;<br> =A0=A0=A0=A0 /+ insert here some alias this-like magic to make WithProperty= melt into an Original and expose _original most of the time +/<br>=A0=A0= =A0 =A0/+ expose property (SortedBy!pred, ConvergesTo!someValue, etc) with = an alias. <br> }<br><br>Of course, it should be recursive: calling WithProperty on a WithP= roperty!(SomeType, OtherProperties) should add the new property to a CT-lis= t, maybe a type-tuple.<br>And there should be a way to disable some propert= ies. As I said before, filter ! pred (some sorted range) is sorted, so it s= hould indicate it. But map ! fun (sorted range) is _not_ sorted in general,= and should &#39;off&#39; this particular property.<br> <br>What you&#39;re doing is defining &#39;interfaces&#39;, a sort of CT du= ck-typing, and your constraints templates check for the presence or absence= of those new functions. I now realize much of what I longed for could be e= ncoded in this way. <br> <br>* Convergence to a certain value =3D&gt; expose a .limit() primitive.<b= r>* every value in the range occupy a certain &#39;range&#39;, smaller than= what allows ElementType!Range (constraining values between 0.0 and 1.0 for= a double-producing range, for example) =3D&gt; expose minValue/maxValue me= thods. <br> <br>OK, it&#39;s past midnight around here, tomorrow my daughters will want= to jump in the Mediterranean for hours, again.=A0I&#39;ll shut up and go t= o bed.<br><br><br>Philippe<br><br> --0015175cd450ce4f58048c3df2b3--
Jul 25 2010