
digitalmars.D - PoC: Cached function calls

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
A discussion on this group caused me to add a new feature to a hypothetical
scripting language I've been toying around with for a couple months or so.
It looks, basically, like this:

# cache Int fibidx (Int n) {
#   if (n < 2) return n;
#   else       return fibidx(n - 1) + fibidx(n - 2);
# }

The nifty feature is the keyword 'cache' up there, which causes the function
'fibidx' to be evaluated /exactly once/ for a given parameter list.  (Granted
in the case of Fibonacci one can easily accomplish what I'm talking about with
a class, but I thought it'd make a simple demonstration case.)
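
For instance, a hand-rolled D version using a class is only a few lines
(just a sketch off the top of my head; the names are purely illustrative):

# class CachedFib {
#   private long[int] cache;            // memo table: n -> fib(n)
#
#   long opCall (int n) {
#     long* hit = n in cache;
#     if (hit) return *hit;             // already computed once
#
#     long result = (n < 2) ? n : opCall(n - 1) + opCall(n - 2);
#     cache[n] = result;
#     return result;
#   }
# }
#
# // auto fib = new CachedFib;
# // fib(40);                           // each n computed exactly once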

So I got to thinking... wouldn't it be nifty to have this, or something like
it, in D!  With that in mind, I tried to think up a way it might be
accomplished -- I hadn't yet actually tried to do anything with any of D's
nifty new templating features, namely tuples.  Well, I've learned a few
things anyhow.  :)  The following actually works!

# import std .traits ;
#
# struct TCachedFunc (alias Func) { static:
#
#   alias ReturnType         !(Func) Ret    ;
#   alias ParameterTypeTuple !(Func) Params ;
#
#   static if (Params.length > 1) {
#     // multiple parameters
#
#     private struct Node {
#       Params params ;
#     }
#
#     private Ret[Node] p_cache ;
#
#     Ret opCall (Params args) {
#       Node node   ;
#       Ret* result ;
#
#       foreach (i, x; args) {
#         node.params[i] = x;
#       }
#       result = node in p_cache;
#
#       if (!result) {
#         p_cache[node] = Func(args);
#         result = node in p_cache;
#       }
#       return *result;
#     }
#   }
#   else {
#     // single parameter
#
#     alias Params[0] Param ;
#
#     private Ret[Param] p_cache ;
#
#     Ret opCall (Param arg) {
#       Ret* result = arg in p_cache;
#
#       if (!result) {
#         p_cache[arg] = Func(arg);
#         result = arg in p_cache;
#       }
#       return *result;
#     }
#   }
# }

Given a 'fibidx' D function like the one above, one may use this template to
create a cached version by simply aliasing the template.  For example:
# alias TCachedFunc!(fibidx) fibidx_cached;
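
For clarity, the plain D 'fibidx' being wrapped is just a direct translation
of the scripting version; something like this (any integer types would do):

# long fibidx (int n) {
#   if (n < 2) return n;
#   else       return fibidx(n - 1) + fibidx(n - 2);
# }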

Then just call it like normal.  I timed such a function as a means of testing
the template.  The results follow (in order: normal function call, initial use
of cache, a second use of the cache):
<Benchmark TCachedFunc> Baseline 17.520000
<Benchmark TCachedFunc> Time 16.810000 & 1.042237 versus baseline
<Benchmark TCachedFunc> Time 0.000000 & inf versus baseline

Just, wow.  That said, it really only exhibits any benefits for functions of
some reasonable complexity, or with deep recursion that eats up cycles (like a
fib function ;)).  Anything that would normally be inlined by the compiler
will definitely perform better with a normal call.

Anyhow, I just thought someone might find it interesting.  Maybe even useful.
I'm considering it for inclusion in Cashew, once I figure out how to properly
do this with a delegate as well.

-- Chris Nicholson-Sauls
Dec 15 2006
Kyle Furlong <kylefurlong gmail.com> writes:
Chris Nicholson-Sauls wrote:
 [snip: original post quoted in full]

Cool!
Dec 16 2006
Lutger <lutger.blijdestijn gmail.com> writes:
Chris Nicholson-Sauls wrote:
 [snip: original post quoted in full]

That is pretty cool.  The technique is called memoization, iirc.  One problem is that functions in D are not guaranteed to be referentially transparent, so for some class of functions the cached result will be incorrect (and of course the function must have no side effects).  But the user can determine whether calling the function with the same arguments will always lead to the same result, so I think it is useful anyway if you are aware of that.  Thanks.
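
For instance, something like this contrived sketch (names purely illustrative) would silently keep returning a stale value once cached:

# int offset = 0;
#
# int shifted (int n) { return n + offset; }   // result depends on a global
#
# alias TCachedFunc!(shifted) shifted_cached;
#
# // shifted_cached(1) caches and returns 1 while offset == 0;
# // after 'offset = 10;' it still returns the stale cached 1.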
Dec 16 2006
Jari-Matti Mäkelä <jmjmak utu.fi.invalid> writes:
Lutger wrote:
 That is pretty cool. The technique is called memoization iirc. One
 problem is that functions in D are not guaranteed to be referentially
 transparent, thus for some class of functions the result will be
 incorrect (also there have to be no side-effects of course). But the
 user could determine that calling the function with the same arguments
 will always lead to the same result, so I think it is useful anyway if
 you are aware of that, thanks.

Dynamic programming and memoization might come in handy when doing intensive string processing with templates. I did some benchmarking with runtime versions of "longest common substring" in D a while ago. It might be pretty easy to convert them into templates. Haven't done much template magic, though.
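
For reference, the runtime version is just the usual DP table (a rough sketch, not the exact code I benchmarked):

# // Length of the longest common substring of a and b.
# size_t lcsLength (char[] a, char[] b) {
#   auto table = new size_t[][](a.length + 1, b.length + 1);
#   size_t best = 0;
#
#   for (size_t i = 1; i <= a.length; i++) {
#     for (size_t j = 1; j <= b.length; j++) {
#       if (a[i - 1] == b[j - 1]) {
#         table[i][j] = table[i - 1][j - 1] + 1;
#         if (table[i][j] > best) best = table[i][j];
#       }
#     }
#   }
#   return best;
# }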
Dec 16 2006
prev sibling next sibling parent "Kristian Kilpi" <kjkilpi gmail.com> writes:
On Sat, 16 Dec 2006 08:11:09 +0200, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
 [snip: original post quoted in full]

Heheh, nice!  I was considering implementing a similar thing in C++ a while ago.  The solution is so simple in D. :)
Dec 17 2006
janderson <askme me.com> writes:
Nice, I really like these types of algorithms.  I once wrote a C++ one to
move the vtables of objects to a local area in memory to avoid double
seeking.  One optimisation you may consider (if it hasn't already been
added) is:

Change
 #         p_cache[node] = Func(args);
 #         result = node in p_cache;

To
 #         result = Func(args);
 #	    p_cache[node] = result;

That way you avoid an unnecessary lookup, unless D is smart enough to do the optimisation itself, which would be nice but unlikely.

Chris Nicholson-Sauls wrote:
 [snip: TCachedFunc code quoted in full]

Dec 17 2006
Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
janderson wrote:
 Nice, I really like these types of algorithms.  I once wrote a C++ one to
 move the vtables of objects to a local area in memory to avoid double
 seeking.  One optimisation you may consider (if it hasn't already been
 added) is:
 
 Change
  > #         p_cache[node] = Func(args);
  > #         result = node in p_cache;
 
 To
 
  > #         result = Func(args);
  > #        p_cache[node] = result;

Actually I ended up changing these to 'return p_cache[node] = Func(args);'.  Even quicker. :)  The only problem with your suggested way is that 'result' is a pointer, and I can't take the address of a return value (I don't think...).
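
So the miss branch of opCall now boils down to this (from memory, untested):

#       result = node in p_cache;
#
#       if (!result) {
#         return p_cache[node] = Func(args);
#       }
#       return *result;

-- Chris Nicholson-Sauls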
Dec 17 2006