www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Caching in computing ranges

reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
I've been having fun with ranges lately. While nesting computing ranges I
noticed only the 
outermost range's cache is necessary; there's no way of accessing front() of
ranges deeper 
in the expression twice because they are sealed by the outermost range. Example:

map!"a._0 + a._1"(    // caches, front() may be called twice
   zip(    // oh, trumpet: front() is called only to compute outer map's cache
       map!"a*a"([2, 4, 5, 6]),    // oh, trumpet
       take(sequence!"a._0 + n * a._1"(1, 2), 4)    // oh, trumpet
   )
);

Eliminating superfluous caches, among other benefits, allows inlining the
(usually cheap) 
front()s. One way to do it is to parametrize computing ranges with an enum
Cached.(yes|no). 
What you think?

-- 
Tomek
Oct 09 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 10/09/2010 09:38 PM, Tomek Sowiński wrote:
 I've been having fun with ranges lately. While nesting computing ranges I
noticed only the
 outermost range's cache is necessary; there's no way of accessing front() of
ranges deeper
 in the expression twice because they are sealed by the outermost range.
Example:

 map!"a._0 + a._1"(    // caches, front() may be called twice
     zip(    // oh, trumpet: front() is called only to compute outer map's cache
         map!"a*a"([2, 4, 5, 6]),    // oh, trumpet
         take(sequence!"a._0 + n * a._1"(1, 2), 4)    // oh, trumpet
     )
 );

 Eliminating superfluous caches, among other benefits, allows inlining the
(usually cheap)
 front()s. One way to do it is to parametrize computing ranges with an enum
Cached.(yes|no).
 What you think?

How about never having caches and if you need them, you can get a cached(map(..etc..))? I do not understand where the notion that ranges should have caches come from. It destroys the possibility for using const and complicates the writing of them.
Oct 09 2010
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 9/10/10 8:58 PM, Pelle wrote:
 How about never having caches and if you need them, you can get a
 cached(map(..etc..))?

 I do not understand where the notion that ranges should have caches come
 from. It destroys the possibility for using const and complicates the
 writing of them.

I agree with this. Caching should not be a default. As it stands, I know that std.algorithm.Map uses a cache, I haven't checked any others.
Oct 09 2010
next sibling parent reply Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Peter Alexander napisał:

 On 9/10/10 8:58 PM, Pelle wrote:
 How about never having caches and if you need them, you can get a
 cached(map(..etc..))?

 I do not understand where the notion that ranges should have caches come
 from. It destroys the possibility for using const and complicates the
 writing of them.

I agree with this. Caching should not be a default. As it stands, I know that std.algorithm.Map uses a cache, I haven't checked any others.

std.range.Sequence also caches. I swear I saw Zip do it too, but looking at the code now, I see no cache. By the way, I've been reading Phobos quite intensively lately and I'm willing to spend some time to make the change Pelle was talking about (remove caching, add the Cached higher- order range). How can I contribute that to Phobos? -- Tomek
Oct 09 2010
next sibling parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 9/10/10 10:06 PM, Tomek Sowiński wrote:
 How can I contribute that to Phobos?

I believe you have been accepted as a trusted commiter by some people (Walter and Andrei?). Most people appear to earn trust by writing a high quality library.
Oct 09 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 I believe you have been accepted as a trusted commiter by some people 
 (Walter and Andrei?). Most people appear to earn trust by writing a high 
 quality library.

Right. While DMD is currently NOT managed as a true Open Source project (and this is very bad for the future and development of the D language, and will need to be eventually fixed if D wants a bit more success), Phobos2 is becoming open source enough, there are several good and gentle people working on Phobos2, Andrei is gentle and accepts the code he receives, etc. So Phobos2 may avoid the fate of Phobos1 (where the frustration of people unable to push patches upstream has lead to the creation of Tango). Becoming more Open Source (where people can patch the code written by others) can't guarantee a future to D, but the opposite is worse. Walter is now accepting more patches and suggestions from Don, so the situation is improving a bit regarding the compiler too, so I am happy and hopeful :-) Bye, bearophile
Oct 09 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/10 16:06 CDT, Tomek Sowiński wrote:
 Peter Alexander napisał:

 On 9/10/10 8:58 PM, Pelle wrote:
 How about never having caches and if you need them, you can get a
 cached(map(..etc..))?

 I do not understand where the notion that ranges should have caches come
 from. It destroys the possibility for using const and complicates the
 writing of them.

I agree with this. Caching should not be a default. As it stands, I know that std.algorithm.Map uses a cache, I haven't checked any others.

std.range.Sequence also caches. I swear I saw Zip do it too, but looking at the code now, I see no cache. By the way, I've been reading Phobos quite intensively lately and I'm willing to spend some time to make the change Pelle was talking about (remove caching, add the Cached higher- order range). How can I contribute that to Phobos?

Thanks for offering. Patches or simple snippets meant to replace well-defined parts of existing code should be fine. But in this case (i.e. caching) where the replacement is trivial, a bug report would be sufficient and most appreciated. Andrei
Oct 09 2010
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/10 15:12 CDT, Peter Alexander wrote:
 On 9/10/10 8:58 PM, Pelle wrote:
 How about never having caches and if you need them, you can get a
 cached(map(..etc..))?

 I do not understand where the notion that ranges should have caches come
 from. It destroys the possibility for using const and complicates the
 writing of them.

I agree with this. Caching should not be a default. As it stands, I know that std.algorithm.Map uses a cache, I haven't checked any others.

It's the only one I can remember. Andrei
Oct 09 2010
prev sibling parent Tomek =?UTF-8?B?U293acWEc2tp?= <just ask.me> writes:
Pelle napisał:

 On 10/09/2010 09:38 PM, Tomek Sowiński wrote:
 I've been having fun with ranges lately. While nesting computing ranges I
 noticed only the outermost range's cache is necessary; there's no way of
 accessing front() of ranges deeper in the expression twice because they
 are sealed by the outermost range. Example:

 map!"a._0 + a._1"(    // caches, front() may be called twice
     zip(    // oh, trumpet: front() is called only to compute outer map's
     cache
         map!"a*a"([2, 4, 5, 6]),    // oh, trumpet
         take(sequence!"a._0 + n * a._1"(1, 2), 4)    // oh, trumpet
     )
 );

 Eliminating superfluous caches, among other benefits, allows inlining the
 (usually cheap) front()s. One way to do it is to parametrize computing
 ranges with an enum Cached.(yes|no). What you think?

How about never having caches and if you need them, you can get a cached(map(..etc..))?

That's an excellent idea.
 I do not understand where the notion that ranges should have caches come
 from. It destroys the possibility for using const and complicates the
 writing of them.

Plus, computations are done in popFront(). Then skippers like Stride compute many values in vain. -- Tomek
Oct 09 2010
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/10 14:38 CDT, Tomek Sowiński wrote:
 I've been having fun with ranges lately. While nesting computing ranges I
noticed only the
 outermost range's cache is necessary; there's no way of accessing front() of
ranges deeper
 in the expression twice because they are sealed by the outermost range.
Example:

 map!"a._0 + a._1"(    // caches, front() may be called twice
     zip(    // oh, trumpet: front() is called only to compute outer map's cache
         map!"a*a"([2, 4, 5, 6]),    // oh, trumpet
         take(sequence!"a._0 + n * a._1"(1, 2), 4)    // oh, trumpet
     )
 );

 Eliminating superfluous caches, among other benefits, allows inlining the
(usually cheap)
 front()s. One way to do it is to parametrize computing ranges with an enum
Cached.(yes|no).
 What you think?

I think it's a good idea. In fact let's eliminate caching altogether; it does not happen for the random access operator anyway and I think it's fair to simply evaluate the function whenever an element is accessed. Whaddaya think? Andrei
Oct 09 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 On 10/9/10 14:38 CDT, Tomek Sowiński wrote:
 I've been having fun with ranges lately. While nesting computing ranges I


 outermost range's cache is necessary; there's no way of accessing front() of


 in the expression twice because they are sealed by the outermost range.
Example:

 map!"a._0 + a._1"(    // caches, front() may be called twice
     zip(    // oh, trumpet: front() is called only to compute outer map's cache
         map!"a*a"([2, 4, 5, 6]),    // oh, trumpet
         take(sequence!"a._0 + n * a._1"(1, 2), 4)    // oh, trumpet
     )
 );

 Eliminating superfluous caches, among other benefits, allows inlining the


 front()s. One way to do it is to parametrize computing ranges with an enum


 What you think?

does not happen for the random access operator anyway and I think it's fair to simply evaluate the function whenever an element is accessed. Whaddaya think? Andrei

Vote++. When I added bidirectional support for map(), I felt like any solution for doing bidirectional caching was going to suck. I considered just eliminating it, but left it because at the time I was more interested in just getting bidirectional/random access working than worrying about the cache issue. IMHO the only range that should be cached is a Cached higher order range.
Oct 09 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/10 18:20 CDT, dsimcha wrote:
 Vote++.  When I added bidirectional support for map(), I felt like any solution
 for doing bidirectional caching was going to suck.  I considered just
eliminating
 it, but left it because at the time I was more interested in just getting
 bidirectional/random access working than worrying about the cache issue.  IMHO
the
 only range that should be cached is a Cached higher order range.

Cached will be quite a story in and of itself. I can tell you for sure we need two related abstractions in std.range: 1. Lookback!Range provides a history that allows the range to look back up to n items: auto r = lookback(file.byLine(), 20); ... r.lookback(3); // three lines before the current 2. Lookahead!Range similarly allows looking ahead: auto r = lookahead(file.byLine(), 20); ... r.lookahead(3); // three lines after the current Andrei
Oct 09 2010
prev sibling next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 9/10/10 11:40 PM, Andrei Alexandrescu wrote:
 I think it's a good idea. In fact let's eliminate caching altogether; it
 does not happen for the random access operator anyway and I think it's
 fair to simply evaluate the function whenever an element is accessed.
 Whaddaya think?

 Andrei

Yep, kill it.
Oct 09 2010
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 09 Oct 2010 18:43:54 -0500, Andrei Alexandrescu wrote:

 On 10/9/10 18:20 CDT, dsimcha wrote:
 Vote++.  When I added bidirectional support for map(), I felt like any
 solution for doing bidirectional caching was going to suck.  I
 considered just eliminating it, but left it because at the time I was
 more interested in just getting bidirectional/random access working
 than worrying about the cache issue.  IMHO the only range that should
 be cached is a Cached higher order range.

Cached will be quite a story in and of itself. I can tell you for sure we need two related abstractions in std.range: 1. Lookback!Range provides a history that allows the range to look back up to n items: auto r = lookback(file.byLine(), 20); ... r.lookback(3); // three lines before the current 2. Lookahead!Range similarly allows looking ahead: auto r = lookahead(file.byLine(), 20); ... r.lookahead(3); // three lines after the current Andrei

A PushBack!Range that allows 'un-popping' elements could also be useful. -Lars
Oct 11 2010